1st Summer Datathon on Linguistic Linked Open Data (SD-LLOD-15)

The 1st Summer Datathon on Linguistic Linked Open Data (SD-LLOD-15) will be held from June 15th to 19th 2015 at Residencia Lucas Olazábal of Universidad Politécnica de Madrid, Cercedilla, Madrid.

The SD-LLOD datathon has the main goal of giving people from industry and academia practical knowledge in the field of Linked Data applied to Linguistics. The final aim is to allow participants to migrate their own (or others') linguistic data and publish them as Linked Data on the Web. This datathon is the first organized on this topic worldwide and is supported by the LIDER FP7 Support Action.

Organized by

Partner with

Supported by

Outcomes

During the datathon, participants will work on:

  • Generating and publishing their own Linguistic Linked Data from some existing data source
  • Applying Linked Data principles and Semantic Web technologies (Ontologies, RDF, Linked Data) to the field of language resources
  • Using the principal models for representing Linguistic Linked Data, in particular lemon and NIF
  • Performing Multilingual Word Sense Disambiguation and Entity Linking for the Web of Data
  • Demonstrating potential benefits and applications of Linguistic Linked Data for specific use cases

Topics

During the datathon, seminars will be organised to cover:

  • Ontologies and Linked Data
  • The Lexicon Model for Ontologies (lemon)
  • The NIF format for Integrating NLP with Linked Data and RDF
  • Guidelines for RDF generation of Language Resources
  • Methodologies for Linked Data publication of Language Resources
  • Multilingual Word Sense Disambiguation and Entity Linking with BabelNet
  • Use and Applications of Linguistic Linked Data
  • Metadata and Licenses for Linguistic Linked Data

With the objective of avoiding passive learning, the program of the summer datathon will contain three types of sessions:

      Seminars to show novel aspects and discuss selected topics
      Practical sessions to introduce the basic foundations, methods, and technologies of each topic, in which participants will perform different tasks using the methods and technologies presented
      Hacking sessions where participants will follow the whole process of generating and publishing Linguistic Linked Data with some existing data set

Participants are expected to bring to the datathon a dataset of linguistic data produced by their organizations, in order to work on it during the hacking sessions and transform it into Linked Data. For participants who cannot provide their own linguistic dataset, the organisers will provide one.
Participants will be provided with digital copies of all the material used during the course and will have access to computers with all the required software pre-installed.

Pre-requirements

We welcome participants from anywhere in the world, from industry or academia. Some basic acquaintance with software development and Web technologies is required. Participants are expected to participate fully in the activities of the datathon until its conclusion.

Programme

Invited talks

Rodolfo Maslias “Institutional Terminology, Tools and Communication”. Monday 15th 9:30 Abstract - Slides
Christian Chiarcos “Linked Open Dictionaries (LiODi) Lexical and phonological search in multilingual dictionaries”. Tuesday 17th 9:30 Abstract - Slides
Marta Villegas “Publishing and Consuming Linked Data. (Lessons learnt when using LOD in an application)”. Wednesday 18th 9:30 Abstract - Slides
Piek Vossen “The Global Wordnet Grid”. Thursday 16th 9:30 Abstract - Slides

Seminars

Asun Gómez-Pérez “Maximising (Re)Usability of Linguistic Resources using Linked Data”. Monday 15th 10:30 Slides
John McCrae “lemon: The Lexicon Model for Ontologies”. Monday 15th 14:30 Slides
Felix Sasaki “Roundtripping of NIF based Linguistic Linked Data with non linked data sources”. Tuesday 16th 10:30 Slides
Jorge Gracia “Apertium RDF: an experience in generating linguistic linked open data”. Tuesday 16th 14:30 Slides
Philipp Cimiano “Linked Terminologies: applying linked data principles to terminologies”. Wednesday 17th 10:30 Slides
Gilles Sérasset “The DBnary eco-system, data and APIs”. Wednesday 17th 14:30 Slides
Víctor Rodriguez-Doncel “Rights and licenses for language resources”. Thursday 18th 9:30 Slides

Practical sessions

Monday 15th
         “Introduction to Ontologies, RDF and LD” Jorge Gracia
         “Multilingual LD generation and publishing” Jorge Gracia and Daniel Vila-Suero Slides
         “lemon” John McCrae Slides

Tuesday 16th
         “NIF” Ciro Baron and Bettina Klimek Slides
         “RDF generation with D2RQ” Andrejs Ābele Slides
         “BabelNet and BabelFy” Tiziano Flati

Wednesday 17th
         “lemon-ade” Mariano Rico Slides

Organisers

Jorge Gracia

OEG - Universidad Politécnica de Madrid (Spain)

My name is Jorge Gracia del Río. I am a researcher at the Ontology Engineering Group (OEG), Artificial Intelligence Department, Universidad Politécnica de Madrid (UPM), Spain. My main research interests are Semantic Web, Ontology Matching, Multilingual Web of Data, Query Interpretation, and Web Intelligence.

For further information, please visit my personal website.

John Philip McCrae

CITEC - University of Bielefeld, Germany

I am a post-doctoral researcher at the University of Bielefeld in Bielefeld, Germany. I am currently working in Prof. Philipp Cimiano's group, AG Semantic Computing. My research interests include the following: ontologies and the lexicon-ontology interface, collaborative development and publishing of language resources, big data and data science, linked data and the Semantic Web, etc.

For further information, please visit my personal website.

How to Apply

The number of attendees at this datathon will be limited to 50. Registration is now closed. The cost of the datathon, including accommodation and meals for participants, is sponsored by the LIDER project. However, there will be an administrative fee of 50 euros for registering for the datathon. A limited number of travel grants will be available for attendees from less-developed countries who cannot cover their trip with other funds.

You will receive the selection result in mid-March. If you are one of the selected students, you will also receive the payment instructions.

Important Dates

Registration opens: January 19th, 2015
Registration closes: March 25th, 2015 (previously March 15th, 2015)
Notification: April 3rd, 2015 (previously March 23rd, 2015)
Payment: from May 10th to May 20th, 2015
Datathon: June 15th to 19th, 2015

Invited speakers



Tutors



Speakers



Other collaborators

Participants

Report

Here you can find a brief report summarising how the datathon went.

Here are some pictures from Twitter.

Venue

The Summer Datathon on Linguistic Linked Open Data (SD-LLOD-15) will be held in Cercedilla (Madrid), a small village in the mountains near Madrid.

The event will take place at the Residencia Lucas Olazábal of Universidad Politécnica de Madrid. The residence is in the forest of the Sierra de Guadarrama, in a place known as Las Dehesas de Cercedilla, 50 km from Madrid and 15 km from Navacerrada (Directions).

Going to Cercedilla from Madrid-Barajas airport

The nearest airport is Adolfo Suárez Madrid-Barajas. Once you are in Madrid, the best option to go to Cercedilla is by train. The estimated time of the whole trip (from Adolfo Suárez Madrid-Barajas airport to Cercedilla Residence) is around 2 hours.

Taking the train to Cercedilla

If you arrive at Terminals 1, 2 or 3 of Adolfo Suárez Madrid-Barajas Airport
Go to the metro station and take line 8 (the pink line on the map) to Nuevos Ministerios station. Leave the metro station and go to the train station (Cercanías Renfe). Then take line C8B to Cercedilla.

If you arrive at Terminal 4 of Adolfo Suárez Madrid-Barajas Airport
Go to the train station and take line C1 to Chamartín. Then take line C8B to Cercedilla.

Getting to the Residence From Cercedilla

The train station in Cercedilla is 1.5 km away from the Residence. You should take a taxi to get to the Residence. Here are phone numbers for taxis in Cercedilla:
(+34) 61 980 64 52
(+34) 65 026 30 43
(+34) 61 922 62 72

In addition, members of the summer school will give directions on how to get a taxi at the train station.

If there is nobody waiting for you at the Cercedilla train station, please phone the following numbers:
(+34) 91 852 23 78
(+34) 91 858 20 06

Going by car to the Residence

From Madrid, take motorway A-6 until the exit for El Escorial, Guadarrama. Head in the direction of Guadarrama until the crossroads with the old N-6 (a crossroads with traffic lights where you can see El Piquio Hotel). Turn left and drive through Guadarrama village until the sign for Cercedilla (a road on the right, next to a headquarters). Go straight on along this road until Cercedilla. When you pass the Cercedilla train station, go straight on at the next crossroads in the direction of Las Dehesas – La Fuenfría. When you arrive at the forest information, turn right following the signs for Residencia Lucas Olazábal UPM.

Residencia Lucas Olazábal