Funded Projects

Project Title Large-Scale Capture of Producer-Defined Musical Semantics
Principal Investigator Cham Athwal
Project Partners School of Digital Media Technology (Birmingham City University), Centre for Digital Music (Queen Mary University of London), Birmingham Conservatoire (Birmingham City University)
Summary The study is motivated by the lack of transferable semantic descriptors in music production and the requirement for more intuitive control of low-level parameters, thus providing musicians with easier access to technology. We aim to overcome this problem by evaluating large amounts of labelled data taken from within the digital audio workstation. The main novelty that will be introduced by the project is a model for the estimation of perceptually accurate descriptors based on a large corpus of semantically annotated music production data. The outcome of the mini-project will be the identification of an appropriate methodology for the capture of this semantic data.
Report Presentation given at "Semantic Media @ British Library" event 2013.
Ryan Stables, Assisted Parameter Modulation in Music Production using Large-Scale Producer-Defined Semantics, Innovation In Music, York, England, 2013. Full source code available on GitHub. (Follow-up paper in preparation.)

Project Title Semantic 
 in Early 
Principal Investigator Tim 
Project Partners Goldsmiths College, City University, University of Oxford, BBC
Summary Linking data from various sources via metadata and/or content is a vital task in musicology and library cataloguing, where semantic annotations play an essential role. This innovative pilot project will work with two types of data in ECOLM: (a) encoded scores OCR'd from 16th-century printed music; (b) expert metadata from British Library cataloguers. We’ll build on existing ontologies such as the Music Ontology, introducing key concepts embedded in our historical text and music images (e.g., place and person names, dates, music titles and lyrics), and prepare the ground for a new ontology for melodic, harmonic and rhythmic sequences. 16th-century printed texts vary widely in quality, spelling, language, font and layout, so support for approximate matching, e.g. using the Similarity Ontology, is vital for human control and interaction in the cataloguing and retrieval of historical music documents. The project will produce an online demonstrator to show the principles in action, serving as a multidisciplinary pilot application of Linked Data in the study of early music that will be widely applicable to scholarship in other musical and historical repertories.
Report Presentation given at Digital Music Research Network (DMRN) 2013.
Presentation given at "Semantic Media @ British Library" event 2013.
Article presenting project findings: Crawford, Tim, Fields, Ben, Lewis, David and Page, Kevin R. 2014. Explorations in Linked Data practice for early music corpora. In: Digital Libraries 2014.

Project Title Tawny Overtone
Principal Investigator Phillip Lord
Project Partners University of Newcastle, University of Manchester
Summary This is highly speculative work that will enable us to understand whether we can compose and orchestrate metadata alongside the music; it will push the boundaries of the integration of music and semantics.
We propose to investigate born-semantic music, where semantic annotation can be added at any point in the production of the music. For this, we will combine Overtone and Tawny-OWL. The former is an electronic music system that allows the description and synthesis of music at all levels: from the quality of the sounds, to notes and rhythm, to song or composition level. The latter allows the generation of semantic annotations in OWL. Crucially, both use the same underlying syntax and language. This should allow annotations at any level to percolate; so, for example, if a musician creates a drum sound, their role as a contributor should automatically percolate through to any piece of music using that sound. Likewise, richer annotations such as mood, pace and style should percolate.
Report Publication of a report or article is imminent. Blog entries describing the intermediate steps: Project Description, Music Ontology Representation in Tawny-OWL. Final report.
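The percolation idea in the summary above can be illustrated with a minimal sketch. It is written in plain Python rather than in the language shared by Overtone and Tawny-OWL, and it does not use either tool; the `Element` class, its fields and the example names are all hypothetical, chosen only to show how an annotation attached to a drum sound could reach every piece that uses it.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A musical element (sound, pattern or piece) with its own annotations.

    Hypothetical structure for illustration; not the project's data model.
    """
    name: str
    annotations: dict[str, set[str]] = field(default_factory=dict)
    parts: list["Element"] = field(default_factory=list)

    def all_annotations(self) -> dict[str, set[str]]:
        """Merge this element's annotations with those of every part it uses."""
        merged = {key: set(values) for key, values in self.annotations.items()}
        for part in self.parts:
            # Recurse so that annotations percolate up from any depth.
            for key, values in part.all_annotations().items():
                merged.setdefault(key, set()).update(values)
        return merged


# Invented example data: a drum sound reused inside a pattern, inside a track.
kick = Element("kick-drum", {"contributor": {"Alice"}})
beat = Element("four-on-the-floor", {"pace": {"fast"}}, parts=[kick])
track = Element("demo-track", {"contributor": {"Bob"}}, parts=[beat])

print(track.all_annotations())
```

Here the track never names Alice directly, yet she appears among its contributors because the kick drum she created is part of the beat the track uses.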

Project Title SemanticNews: enriching publishing of news stories
Principal Investigator Jonathon Hare
Project Partners University of Southampton, University of Sheffield, BBC
Summary This project aims to promote people's comprehension and assimilation of news by augmenting live broadcast news articles with information from the Semantic Web (SW) in the form of linked open data (LOD). We plan to lay foundations for a toolkit for real-time automatic provision of semantic analysis and contextualisation of news, encompassing state-of-the-art SW technologies including text mining, consolidation against LOD, and advanced visualisation. To bootstrap our work, we will use television news articles that already have transcripts. Using these we will create a workflow that will a) extract relevant entities using established named entity recognition techniques to identify the types of information to contextualise for a news article; b) provide associations with concepts from LOD resources; c) visualise the context using maps, to provide viewers with geographical information, and graphs derived from the LOD cloud. For example, a political party has different levels of support across the country; this can be visualised with maps and graphs. The project's outcomes will be evaluated in a user study, which will provide feedback regarding toolkit quality and usability, and direct our activities in and beyond the scope of the proposal.
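The three workflow steps (a) to (c) above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's toolkit: a small hand-written gazetteer stands in for a real named entity recogniser, the entity-to-URI table is a hypothetical lookup, and the function names are invented for this example.

```python
import re

# Hypothetical gazetteer standing in for a trained NER model and an LOD lookup:
# surface form -> (entity type, illustrative DBpedia-style URI).
GAZETTEER = {
    "Southampton": ("Place", "http://dbpedia.org/resource/Southampton"),
    "Sheffield": ("Place", "http://dbpedia.org/resource/Sheffield"),
    "BBC": ("Organisation", "http://dbpedia.org/resource/BBC"),
}


def extract_entities(transcript):
    """Step (a): return known entity names in order of first appearance."""
    found = []
    for token in re.findall(r"[A-Za-z]+", transcript):
        if token in GAZETTEER and token not in found:
            found.append(token)
    return found


def link_entities(names):
    """Step (b): associate each entity with a type and an LOD URI."""
    return [
        {"name": name, "type": GAZETTEER[name][0], "uri": GAZETTEER[name][1]}
        for name in names
    ]


def map_layer(linked):
    """Step (c): keep only places, the input a map visualisation would need."""
    return [entity["uri"] for entity in linked if entity["type"] == "Place"]


transcript = "The BBC reported flooding in Southampton and later in Sheffield."
linked = link_entities(extract_entities(transcript))
places = map_layer(linked)
print(places)
```

A real pipeline would replace the gazetteer with an established NER tool and resolve URIs against live LOD resources; the sketch only shows how the output of each step feeds the next.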

Project Title Computational Analysis of the Live Music Archive (CALMA)
Principal Investigator Simon Dixon
Project Partners University of Manchester, Queen Mary University of London, Oxford e-Research Centre, University of Oxford, The Internet Archive
Summary The objective of the "Computational Analysis of the Live Music Archive" (CALMA) project is to facilitate scholarship related to live music in the areas of music informatics, popular musicology and music information retrieval. The project will develop a Linked Data service exposing substantial data about live music, including core and contextual metadata linked with existing popular Semantic Web resources, as well as the output of content-based analyses (tempo, key, etc.) of audio recordings. The outcomes will be evaluated using exemplar research questions.

Project Title MUSIC - Metadata Used in Semantic Indexes and Charts
Principal Investigator Simon Dixon
Project Partners University of Northampton, Queen Mary University of London, Academic Rights Press
Summary The objective of the "Metadata Used in Semantic Indexes and Charts" (MUSIC) project is to facilitate musicological research in the area of popular music. Emerging Linked Data technologies enable the combination of several music-related data sources published openly on the Semantic Web. Academic Rights Press provides an extensive database of popular music charts, already linked to academic publications: Academic Charts Online (ACO). Fusing these resources will facilitate innovative and so far unprecedented ways of navigating through the popular music space, enabling novel research to be carried out. The integration of resources and the provision of an easy-to-use interface present several challenges requiring disparate skills, interdisciplinary collaboration, and small-scale funding that is otherwise difficult to obtain. These challenges include the effective fusion of Semantic Web resources with data and analytical tools provided by ACO, metadata alignment across different data repositories, testing and improving large-scale data integration technologies, and providing an interface relevant to researchers and students working in popular musicology. The project will thus rely on, and bring value to, multiple disciplines including musicology, Linked Data and the Semantic Web, user interface design, software development, and the broader fields of music informatics and pedagogy.

Article presenting project findings: Mora-McGinity, Ogilvie, Fazekas: "Creating Semantic Links between Research Articles and Music Artists"

Article presenting project findings: Mora-McGinity, Ogilvie, Fazekas: "Semantically Linking Humanities Research Articles and Music Artists"

Project Title WhatTheySaid
Principal Investigator Mike Wald
Project Partners University of Southampton, University College London, BBC
Summary TV programmes produced by the mass media, such as news reports, political discussion programmes and interviews, exert tremendous influence on the transparency of politics in the UK. Political figures need to be responsible for what they say to the media, as they are monitored by the public. However, it is still difficult to automatically analyse recording archives to answer questions such as: did a political figure make a promise some time ago that they did not keep later? Did someone in the government cite a figure that was actually wrong? This project aims to develop a framework using natural language processing and machine learning to automatically extract key concepts from spoken statements and categorise them for searching, viewing and comparison. We will also provide data visualisation along a timeline, where the statements will be visualised together with the speaker, the audio-visual record and related context from the Linked Data Cloud, so that users can easily search, view and compare the statements political figures have made. To bootstrap our work, we will obtain archives of political interview programmes, with transcripts segmented by speaker, from the BBC, the most influential broadcasting company in the UK. We can use natural language processing tools to analyse the transcripts, extract important concepts (semantic annotations) from the statements made and categorise them by key concepts such as law, economy, foreign affairs, NHS, migration, etc. Furthermore, using linked data, each important statement and semantic concept in a programme will be linked to a fragment of the video archive. Users can then search by speaker, category and plain text, and watch the video fragments as proof of the statement. With the help of video metadata, we can also visualise the statements and media fragments along the real-world timeline.
As in the TimelineJS demo, users can easily navigate through the timeline and spot whether statements made by government or political figures at different times are inconsistent.
Report Related Publication: Li, Yunjia, Chaohai Ding, and Mike Wald. "WhatTheySaid: Enriching UK Parliament Debates with Semantic Web." Proceedings ISWC, 2014.

Project Title A 'second screen' music discovery and recommendation service based on social-cultural factors
Principal Investigator Panos Kudumakis
Project Partners Queen Mary University of London
Summary Viewers watching TV may wish to use their tablet or smartphone as a 'second screen', first to identify any music playing on the TV, and then to discover more information about it. To this end, the microphone of the 'second screen' device is used to listen to the music playing on the TV, whilst audio fingerprinting technology is used to identify it. A webpage is then dynamically generated, providing rich information about the music identified, as well as related music and musical artists based on social-cultural factors. The latter is achieved by querying web services such as YouTube, The Echo Nest, and MusicBrainz. Linking and making sense (knowledge inference) of such a wide range of diverse music-related data, acquired from multiple sources and services on the web, is achieved thanks to the C4DM Music Ontology. An Android app acting as a 'second screen' is currently available for demonstration purposes.
Report Open Source Code.

Project Title Design of a Semantic Web Ontology for the PRAISE Practice Agent Architecture
Principal Investigator
Project Partners Goldsmiths, Queen Mary University of London, Artificial Intelligence Research Institute (IIIA), Universitat Autònoma de Barcelona
Summary In this project we designed a Semantic Web ontology according to the PRAISE Practice Agent Architecture specification, using the OWL 2 Web Ontology Language. Semantic Web technologies allow for the structured representation of data that can be shared across agents and queried using powerful RDF query languages such as SPARQL. The ontology covers different forms of feedback such as praise and criticism, including the subtypes constructive, descriptive and evaluative. It covers arrangements of and between people and agents, such as community, peer and teacher. It defines a list of all standard tasks a user can carry out within the PRAISE platform, e.g. record, listen, annotate, share. While the PRAISE specification defines several domain-specific concepts, we also extensively reuse existing ontologies in our design, such as the Music Ontology and the Audio Features Ontology.
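To make the shape of such an ontology concrete, the sketch below represents a small class hierarchy as subject-predicate-object triples in plain Python and answers a subclass query over it. All identifiers under the `ex:` prefix are hypothetical and are not the names used in the PRAISE ontology; placing the feedback subtypes directly under a single feedback class is likewise an assumption made for the example.

```python
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical class hierarchy; identifiers are invented for illustration.
TRIPLES = [
    ("ex:Praise", SUBCLASS_OF, "ex:Feedback"),
    ("ex:Criticism", SUBCLASS_OF, "ex:Feedback"),
    ("ex:ConstructiveFeedback", SUBCLASS_OF, "ex:Feedback"),
    ("ex:DescriptiveFeedback", SUBCLASS_OF, "ex:Feedback"),
    ("ex:EvaluativeFeedback", SUBCLASS_OF, "ex:Feedback"),
    ("ex:Record", SUBCLASS_OF, "ex:Task"),
    ("ex:Listen", SUBCLASS_OF, "ex:Task"),
    ("ex:Annotate", SUBCLASS_OF, "ex:Task"),
    ("ex:Share", SUBCLASS_OF, "ex:Task"),
]


def subclasses(cls, triples=TRIPLES):
    """Return every direct or indirect subclass of cls."""
    direct = [s for s, p, o in triples if p == SUBCLASS_OF and o == cls]
    result = set(direct)
    for child in direct:
        # Follow the hierarchy downwards so deeper levels are included.
        result |= subclasses(child, triples)
    return result


print(sorted(subclasses("ex:Task")))
```

In an actual RDF store the same question would be asked in SPARQL, for instance with the property path `rdfs:subClassOf+`; the Python function only mimics that traversal over an in-memory list.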

Project Title POWkist - Visualising Cultural Heritage Linked Datasets
Principal Investigator Chris Mellish
Project Partners University of Aberdeen, Northumbria University
Summary The POWkist project aims to use semantic technologies to support the visualisation of combined linked datasets in the cultural heritage domain. The goal is to provide systematic and attractive visualisation of cultural heritage linked datasets and to bring raw data closer to citizen-historians for more efficient exploitation. POWkist will cover the whole life-cycle of content, from data collection to data consumption by citizen-historians and the general public.
Report Final Report. Demo Website.

Paper: Kay, A.: "POWKist: Visualising Cultural Heritage Linked Datasets" (Connected Communities Heritage Network Symposium)

Project Title Semantic Linking of BBC Radio (SLoBR) - Programme Data and Early Music
Principal Investigator Kevin Page
Project Partners University of Oxford, BBC, Goldsmiths College, City University
Summary Semantic Linking of BBC Radio (SLoBR) addresses a further crucial step in applying Linked Data (LD) as an end-to-end solution for the music domain. Previous efforts, including the successful SLICKMEM project, primarily generated data and links from and between academic and commercial sources; SLoBR focuses on the use and consumption of such LD and development of tooling to support these applications.
Report Paper: DM Weigl, KR Page, D Lewis, T Crawford, I Knopke: "Unified access to media industry and academic datasets: a case study in early music" (ISMIR)

Paper: DM Weigl, D Lewis, T Crawford, KR Page: "Expert-guided semantic linking of music-library metadata for study and reuse" (International Workshop on Digital Libraries for Musicology)

Paper: T Crawford, B Fields, D Lewis, K Page: "Explorations in Linked Data practice for early music corpora" (JCDL)

Project Title An Argument Workbench - extracting structured arguments from social media
Principal Investigator Adam Wyner
Project Partners University of Aberdeen, University of Sheffield
Summary Reader-contributed comments on a news article are a source of arguments for and against issues raised in the article, where an argument is a claim with justifications and exceptions. For example, commenting on an article about the Crimea, a reader states that Russia's behaviour is unacceptable, giving justifications; another reader criticises the justifications; and so on. It is difficult to form a coherent understanding of the overall, integrated meaning of the comments. Consequently, the "wisdom of crowds" is lost. Difficulties arise because comments are high in volume, updated over time, presented in a list that scatters related ideas, expressed in natural language rather than a machine-readable form, and lacking indicators of the relationships amongst them. While argument visualisation tools help people to understand media-derived arguments, the visualisations are manually reconstructed and thus expensive to produce in terms of time, money, and knowledge. Current automatic text mining techniques, e.g. sentiment analysis and named entity/relation extraction, miss the argument structure. Furthermore, arguments cannot be automatically reprocessed. To reconstruct the arguments sensibly and reusably, we propose a novel argumentation workbench: a semi-automated, interactive, integrated, modular tool set to extract, reconstruct, and visualise arguments. The intention is to present the arguments in a clearer, organised form, not to judge or filter out alternative viewpoints. The workbench integrates well-developed, published, state-of-the-art tools in information retrieval and extraction, visualisation, and computational approaches to abstract and instantiated argumentation. These techniques will identify the higher-level structures of meaning found in argumentation and reasoning. The workbench will be the basis for further theoretical and applied work.