<!DOCTYPE HTML>
<!--
Editorial by HTML5 UP
html5up.net | @ajlkn
Free for personal and commercial use under the CCA 3.0 license (html5up.net/license)
-->
<html>
<head>
<title>Corpling@GU</title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no" />
<!--[if lte IE 8]><script src="assets/js/ie/html5shiv.js"></script><![endif]-->
<link rel="stylesheet" href="assets/css/main.css" />
<link rel="canonical" href="https://gucorpling.org/corpling/" />
<!--[if lte IE 9]><link rel="stylesheet" href="assets/css/ie9.css" /><![endif]-->
<!--[if lte IE 8]><link rel="stylesheet" href="assets/css/ie8.css" /><![endif]-->
</head>
<body id="home">
<!-- Wrapper -->
<div id="wrapper">
<!-- Main -->
<div id="main">
<div class="inner">
<!-- Header -->
<header id="header">
<a href="https://gucorpling.org/corpling/" class="logo"><strong><i class="fa fa-home"> </i>Corpling@GU</strong></a>
<ul class="icons">
<li><a href="https://wiki.gucorpling.org/" class="icon fa-wikipedia-w"><span class="label">Wiki</span></a></li>
<li><a href="https://corpus-tools.org/" class="icon fa icon-annis"><span class="label">Corpus-tools.org</span></a></li>
<li><a href="https://www.github.com/gucorpling/" class="icon fa-github"><span class="label">GitHub</span></a></li>
<!--<li><a href="" >
<span class="iconstack fa-stack">
<i class="iconstack fa fa-search fa-stack-2x"></i>
<strong class="fa-stack-1x calendar-text">CQP</strong>
</span></a></li>-->
</ul>
</header>
<!-- Banner -->
<section id="banner">
<div class="content">
<header>
<h1>Welcome to the Corpling Lab!</h1>
<p>Computational Linguistics with &amp; for Corpora</p>
</header>
<p><b>Corpling@GU</b> is <a href="https://www.georgetown.edu/">Georgetown University</a>'s Computational Corpus Linguistics lab at the <a href="http://linguistics.georgetown.edu/">Department of Linguistics</a>.
We specialize in building:
</p>
<ul>
<li>Corpora</li>
<li>Annotation tools</li>
<li>NLP tools</li>
</ul>
<p>We are also part of the larger Georgetown University Computational Linguistics (<a href="http://gucl.georgetown.edu/">GUCL</a>) community!</p>
<ul class="actions">
<li><a href="people.html" class="button big">Learn More</a></li>
</ul>
</div>
<span class="image object">
<img src="images/GU_campus.png" style="height: auto" alt="Georgetown University - from Wikipedia" />
</span>
</section>
<!-- Section -->
<section>
<header class="major">
<h2>Projects</h2>
</header>
<div class="features">
<article>
<span class="icon fa-search"></span>
<div class="content">
<h3>Search interfaces</h3>
<p>The lab maintains two corpus search interfaces, which offer students and the general public access to language data and statistical analysis tools, as well as an online dictionary and a browsable text repository:</p>
<ul>
<li>The <a href="https://gucorpling.org/cqp/">GU CQPWeb interface</a> for large, flat annotated corpora</li>
<li><a href="https://gucorpling.org/annis-corpora/">ANNIS</a>, search and visualization for richly annotated multilayer corpora</li>
<li><a href="https://coptic-dictionary.org/">Coptic Dictionary Online</a> - a Coptic lexicon linked to corpora and frequency data</li>
<li><a href="http://data.copticscriptorium.org/">Coptic Scriptorium Repository</a> - browsable ancient Coptic texts with linguistic analyses</li>
</ul>
</div>
</article>
<article>
<span class="icon fa-cogs"></span>
<div class="content">
<h3>NLP tools</h3>
<p>We develop a number of NLP tools that help to build corpora automatically, or feed into manual correction loops:</p>
<ul>
<li><a href="https://github.com/amir-zeldes/rftokenizer">RFTokenizer</a> - a state-of-the-art trainable segmenter for morphologically rich languages</li>
<li><a href="https://github.com/gucorpling/DisCoDisCo">DisCoDisCo</a> - discourse relation classification, segmentation and detection</li>
<li><a href="https://tools.copticscriptorium.org/coptic-nlp/">Coptic NLP</a> - a complete pipeline for processing Coptic data</li>
<li><a href="https://gucorpling.org/xrenner/">xrenner</a> - multilingual non-named entity and coreference resolution</li>
<li><a href="https://github.com/amir-zeldes/HebPipe/">HebPipe</a> - an NLP pipeline for Hebrew</li>
</ul>
</div>
</article>
<article>
<span class="icon fa-edit"></span>
<div class="content">
<h3>Annotation tools</h3>
<p>We provide a number of freely available annotation tools:</p>
<ul>
<li><a href="https://gucorpling.org/rstweb/info/">rstWeb</a> - open-source web interface for Rhetorical Structure Theory annotation</li>
<li><a href="https://gucorpling.org/gitdox/">GitDox</a> - a version-controlled online XML and spreadsheet editor with built-in validation</li>
<li><a href="https://gucorpling.org/depedit/">DepEdit</a> - configurable rule-based editing for dependency corpora in the CoNLL-U format</li>
</ul>
</div>
</article>
<article>
<span class="icon fa-files-o"></span>
<div class="content">
<h3>Corpora</h3>
<p>Several of our corpora are freely available, open source projects:</p>
<ul>
<li><a href="https://gucorpling.org/gum/">GUM</a> - The Georgetown University Multilayer corpus, created and published by our students in <a href="https://myaccess.georgetown.edu/pls/bninbp/bwckctlg.p_disp_course_detail?cat_term_in=201730&amp;subj_code_in=LING&amp;crse_numb_in=367">LING-367</a></li>
<li><a href="https://github.com/universalDependencies/UD_Hebrew-IAHLTWiki">UD Hebrew IAHLTwiki</a> - a new <a href="https://universaldependencies.org">UD</a> treebank of Hebrew from Wikipedia</li>
<li><a href="https://data.copticscriptorium.org">Coptic Corpora</a> under CC licenses, including the <a href="http://copticscriptorium.org/treebank.html">Coptic Treebank</a></li>
<li><a href="https://gucorpling.org/gum/amalgum.html">AMALGUM</a> - A Machine-Annotated Lookalike of GUM, a freely available, genre-balanced English web corpus totaling 4M tokens and featuring a large number of high-quality automatic annotation layers</li>
</ul>
</div>
</article>
</div>
</section>
<!-- Section -->
<section>
<header class="major">
<h2>Featured research</h2>
</header>
<div class="posts">
<article>
<a href="research/2021-02-01_gum7.html" class="image"><img src="images/gum7_header.png" alt="GUM7"/></a>
<h3>GUM7 – four added genres, Wikification and more!</h3>
<p>
The first release of <a href="https://gucorpling.org/gum/">GUM series 7</a> now adds four more genres to our multilayer corpus, in addition to brand new annotation layers, corrections, and more.
</p>
<ul class="actions">
<li><a href="research/2021-02-01_gum7.html" class="button">More</a></li>
</ul>
</article>
<article>
<a href="research/2020-06-01_entities_in_the_coptic_treebank.html" class="image"><img src="images/entities.png" alt="Coptic Entities"/></a>
<h3>Entities in the Coptic Treebank</h3>
<p>
Read more about the addition of entity annotations to <a href="https://github.com/UniversalDependencies/UD_Coptic-Scriptorium/">the Coptic Universal Dependencies Treebank</a>.
</p>
<ul class="actions">
<li><a href="research/2020-06-01_entities_in_the_coptic_treebank.html" class="button">More</a></li>
</ul>
</article>
<article>
<a href="research/2018-12-05_new_features_coptic_nlp.html" class="image"><img src="images/ebol_hn.png" alt="New features in our Coptic NLP pipeline" /></a>
<h3>New features in our Coptic NLP pipeline</h3>
<p>
Coptic Scriptorium’s Natural Language Processing (NLP) tools now support two new features...
</p>
<ul class="actions">
<li><a href="research/2018-12-05_new_features_coptic_nlp.html" class="button">More</a></li>
</ul>
</article>
<article>
<a href="research/2018-06-01_signal_rnns.html" class="image"><img src="images/signal_rnn.png" alt="RNN reads newspaper for discourse signals" /></a>
<h3>A neural network reads the newspaper...</h3>
<p>... in search of discourse signals! We now know a lot about what cues people use to identify discourse relations, but can we teach computers to notice the same signals?</p>
<ul class="actions">
<li><a href="research/2018-06-01_signal_rnns.html" class="button">More</a></li>
</ul>
</article>
<!--<article>
<a href="research/2018-05-18_rst_heatmaps.html" class="image"><img src="images/rst_heatmap.png" alt="RST referential heatmap" /></a>
<h3>What you say where - a discourse heatmap</h3>
<p>
Does discourse structure constrain where we talk about what? Research on recurring mentions within discourse graphs shows
back-reference is sensitive to the reasons why sentences and groups of sentences are uttered.
</p>
<ul class="actions">
<li><a href="research/2018-05-18_rst_heatmaps.html" class="button">More</a></li>
</ul>
</article>-->
</div>
<a href="research.html" class="button">More research</a>
</section>
</div>
</div>
<!--#include file="menu.html"-->
</div>
<!-- Scripts -->
<script src="assets/js/jquery.min.js"></script>
<script src="assets/js/skel.min.js"></script>
<script src="assets/js/util.js"></script>
<!--[if lte IE 8]><script src="assets/js/ie/respond.min.js"></script><![endif]-->
<script src="assets/js/main.js"></script>
</body>
</html>