Global Wordnet Association Proposal for Google Season of Docs


The Global WordNet Association (GWA) is a free, public and non-commercial organization that provides a platform for discussing, sharing and connecting semantic lexicons (wordnets) for all languages in the world. Our goal is to make compatible, linked, open lexical resources useful for both humans and computers. As an example, here is the entry for wordnet in the Open Multilingual Wordnet (OMW 1.0) and in the development branch of OMW 2.0.

The English Wordnet is extremely widely used, and so is the multilingual data from OMW 1.0, which is distributed with the Python Natural Language Toolkit and used by Google Translate, Babelnet and many more. The documentation we are building here targets the new version (OMW 2.0), which makes the semantic network less biased towards English. We have high-quality contributions for over 40 languages (not all on the website yet).

We are eager to welcome tech writing collaborators through the Google Season of Docs program to help us with our documentation needs! GWA is a collaboration of many projects. We have selected three for this proposal, and the application has been blessed by the GWA board (five members of which are involved in the proposal).

We have been selected to participate in 2020!
If you are interested in working on this project, email us!


Ideas we invite tech writers to work with us on:

These tasks should all start with a documentation audit, to see what we do and don't have already.


If you would like to know more about these project ideas, please email wngsodoc@gmail.com and the mentors of each project, and we'll reply as soon as we can.


Detailed Descriptions:


Project name: Wordnet Structure



The goal is to make sure all the data types (semantic relations, parts of speech, verb patterns, ...) are properly documented, so that both dictionary users and dictionary developers can access the documentation easily, and to have a procedure for adding more documentation as needed.

Description

We have started to make a central repository of documentation for, e.g., semantic relations: gwadoc, but it needs a lot of work. We are trying to keep UX strings together with the documentation, to make it easier to keep them in sync.

The goal is to have the documentation dynamically produced with the lexicons, so that we know we always have the same set of relations, and we can pull in examples directly from the database.
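One way to read this goal is as a consistency check between the relations a lexicon actually uses and the relations that have documentation. The sketch below is illustrative only: `RELATION_DOCS`, `LEXICON_RELATIONS` and `audit_relations` are hypothetical stand-ins, not the gwadoc API or the OMW database schema.

```python
# Hypothetical documentation store: relation name -> short description.
RELATION_DOCS = {
    "hypernym": "The target is a more general concept than the source.",
    "hyponym": "The target is a more specific concept than the source.",
}

# Hypothetical (source, relation, target) triples, standing in for
# rows that would be read from a lexicon database.
LEXICON_RELATIONS = [
    ("dog", "hypernym", "animal"),
    ("animal", "hyponym", "dog"),
    ("car", "meronym", "wheel"),
]


def audit_relations(docs, triples):
    """Compare the relations used in the data with the documented ones."""
    used = {relation for _, relation, _ in triples}
    documented = set(docs)
    undocumented = sorted(used - documented)  # used in data, no docs yet
    unused = sorted(documented - used)        # documented, never used
    return undocumented, unused


undocumented, unused = audit_relations(RELATION_DOCS, LEXICON_RELATIONS)
print("Used but undocumented:", undocumented)
print("Documented but unused:", unused)
```

Running a check like this whenever the lexicons or the documentation change would flag any relation that exists in one place but not the other.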

We have also started adding more documentation to the development branch of our new OMW interface, but it has not been deployed yet: here is a test server.

Related Material

Link to the open source project that needs documentation: GWADOC, and some more documentation in the development branch of our new OMW interface.

We have started with documentation for the core semantic relations, such as hypernym. The documentation needs to be fleshed out more, with descriptions of the meanings, linguistic tests, examples and more.

Where it exists, our documentation is currently incomplete, inconsistent and overly technical. It would also be good to have more links to the original documentation.

Documentation on relations is spread over many sources, from the original man pages for Princeton Wordnet, to general guidelines (this is from EuroWordnet), books (this is from the Polish Wordnet) and papers.

We often add new information (like additional semantic relations), so it would be good to have a template for adding new relations.
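As a rough illustration of what such a template might capture, the sketch below models one relation entry with the kinds of information mentioned above (meaning, linguistic test, examples, link to original documentation). The class name `RelationDoc` and its field names are assumptions made for this example, not an existing gwadoc structure.

```python
from dataclasses import dataclass, field


@dataclass
class RelationDoc:
    """Hypothetical template for documenting one semantic relation."""
    name: str
    definition: str
    linguistic_test: str
    examples: list = field(default_factory=list)
    source: str = ""  # pointer to the original documentation, if any

    def missing_fields(self):
        """Return the names of fields that are still empty."""
        return [key for key, value in vars(self).items() if not value]

    def render(self):
        """Render the entry as plain text, skipping an empty source."""
        lines = [
            f"Relation: {self.name}",
            f"Definition: {self.definition}",
            f"Test: {self.linguistic_test}",
        ]
        lines += [f"Example: {example}" for example in self.examples]
        if self.source:
            lines.append(f"Source: {self.source}")
        return "\n".join(lines)


# A partly filled-in draft: the template reports what is still missing.
draft = RelationDoc(
    name="hypernym",
    definition="A concept that is more general than the given concept.",
    linguistic_test="",
)
print(draft.missing_fields())
```

A contributor adding a new relation would fill in the same fields, and the list of missing fields could feed directly into a documentation audit.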

Finally (or perhaps initially), it would be good to have a documentation audit to see what information is missing.

Sample

To help get an idea of what the task involves, please take a look at the definitions of semantic relations in GWADOC. Then based on the documentation linked to above, can you try to add some documentation for the Meronym/Holonym relations? When you have done that, either make a pull request or email your sample to wngsodoc@gmail.com. Please feel free to improve on both the content and the appearance.

Desired Skills

Nice to have:
Mentors:

Ewa Rudnicka, German Rigau, Francis Bond


Project name: The Open Multilingual Wordnet


The OMW is relatively complicated, and it would be good to have a couple of user guides.

Description

The Open Multilingual Wordnet provides access to open wordnets in a variety of languages, all linked to a collaborative Interlingual Index. The goal is to make it easy to use wordnets in multiple languages. The individual wordnets have been made by many different projects and vary greatly in size and accuracy. We have defined a common interchange format (GWA-LMF: Lexical markup framework) and a website where people can upload and make new wordnets accessible. The Open Multilingual Wordnet and its components are open: they can be freely used, modified, and shared by anyone for any purpose.
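To give a feel for working with an XML interchange format of this kind, here is a minimal sketch that groups lemmas by synset using only the Python standard library. The fragment is a simplified, made-up example: its element and attribute names, identifiers and the ILI value are assumptions in the LMF style and should be checked against the actual GWA-LMF schema before use.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment in the style of an LMF wordnet file.
SAMPLE = """<LexicalResource>
  <Lexicon id="example-en" language="en">
    <LexicalEntry id="w1">
      <Lemma writtenForm="dog" partOfSpeech="n"/>
      <Sense id="s1" synset="example-en-1"/>
    </LexicalEntry>
    <LexicalEntry id="w2">
      <Lemma writtenForm="domestic dog" partOfSpeech="n"/>
      <Sense id="s2" synset="example-en-1"/>
    </LexicalEntry>
    <Synset id="example-en-1" ili="i0000">
      <Definition>a domesticated canine</Definition>
    </Synset>
  </Lexicon>
</LexicalResource>"""


def lemmas_by_synset(xml_text):
    """Map each synset id to the written forms of the entries linked to it."""
    root = ET.fromstring(xml_text)
    result = {}
    for entry in root.iter("LexicalEntry"):
        lemma = entry.find("Lemma").get("writtenForm")
        for sense in entry.iter("Sense"):
            result.setdefault(sense.get("synset"), []).append(lemma)
    return result


print(lemmas_by_synset(SAMPLE))
```

Because every wordnet shares the same structure, the same few lines would work for any language; only the lexicon content changes.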

The wordnets are all developed independently, although all based on the original Princeton Wordnet of English. They are also typically made in projects with finite funding windows, and not much money for maintenance. Because of this, documentation is spread all over the world: as technical reports, academic papers, theses and more.

Related Material

Link to the open source project that needs documentation.
Open Multilingual Wordnet Version 2.0

These are some of the areas that we think need improvement.

A guide for someone contributing a new wordnet (or family of wordnets)

Existing Documentation

A guide for someone searching the wordnet (or family of wordnets)

There is no existing documentation (we hope it is sort of intuitive, but suspect it may not be to outsiders). Here are some examples.

A guide for interacting with the Collaborative InterLingual Index (CILI)

There is some documentation on CILI in the development branch on GitHub.

Sample

To help get an idea of what the task involves, please try to write some documentation for the concept page (e.g. Page for the concept software documentation). Maybe something similar to the explanation for the Japanese WordNet (sorry it is in Japanese)? You could also suggest ways that the search result page could be improved. When you have done that, either make a pull request or email your sample to wngsodoc@gmail.com. Please feel free to improve on both the content and the appearance.

Skills

Nice to have:
Mentors:
Francis Bond, Alexandre Rademaker

Project Name: How to contribute to wordnet

The goal is to let contributors know how to add new information (words, senses and synsets). It would be good to have (i) a general guide and (ii) a guide specific to the English wordnet.

Background

We have a very rough guide at NTU: http://compling.hss.ntu.edu.sg/ntumc/tagdoc.html

The Polish wordnet project and Eurowordnet also have extensive documentation: general guidelines from EuroWordnet, The Polish wordnet book.

Requires:

Illustrated step-by-step guides for:

There will be a need for language specific extensions, but we want to start by targeting English, specifically the English Wordnet.

Current Documentation

Skills

Sample

To help get an idea of what the task involves, could you try to make a rough guide for someone trying to

You could maybe reference the Wiktionary documentation: Criteria for inclusion and Entry layout. Their structure is different from ours, but it gives a good idea of the kinds of information needed. When you have done that, please email your sample to wngsodoc@gmail.com. Please feel free to improve on both the content and the appearance. This is probably the hardest task, and doing it properly will require discussion with the wordnet developers. For the sample, maybe just sketch out general guidelines.

Nice to have:
Mentors
John McCrae, Francis Bond

Acknowledgments

Proposal format very much inspired by Kolibri's from 2019, thanks to the GWA board for discussion, and the GSoDoc organisers for the chats.


✉ Francis Bond and Alexandre Rademaker (Organization administrators)
GWA Documentation Working Group
Global Wordnet Association
This is hosted on GitHub: https://github.com/globalwordnet/doc