Job opportunity: CorCenCC project RA

The CorCenCC project team are recruiting an RA (to start from October 2017). For more information about the post, click here.

Future plans, conferences and events (May 2017)

Over the coming weeks, the WP1 team will continue to visit counties across Wales to collect data. This work includes recording at a number of events, from internal meetings, public lectures, choir practices and coffee mornings, to public events and festivals such as Sesiwn Fawr Dolgellau and the Royal Welsh Show’s Spring Festival – not to mention the Urdd Eisteddfod and National Eisteddfod!

But data collection is not the only work going on this summer. CorCenCC team members will be presenting the project’s work at the Corpus Linguistics 2017 international conference as well! Over the five days of the conference, we will be giving four presentations and presenting a poster too, in order to raise awareness of the project’s different work packages amongst other researchers. Details of the papers follow:

  • Piao, S., Rayson, P., Watkins, G., Knight, D. and Donnelly, K. (2017). Towards a Welsh Semantic Tagger: Creating Lexicons for A Resource Poor Language. Paper presented at the Corpus Linguistics Conference 2017, July 2017, University of Birmingham.
  • Rees, M., Watkins, G., Needs, J., Morris, S., and Knight, D. (2017). Creating a Bespoke Corpus Sampling Frame for a Minoritised Language: CorCenCC, the National Corpus of Contemporary Welsh. Paper presented at the Corpus Linguistics Conference 2017, July 2017, University of Birmingham.
  • Neale, S., Spasić, I., Needs, J., Watkins, G., Morris, S., Fitzpatrick, T., Marshall, L., and Knight, D. (2017). The CorCenCC Crowdsourcing App: A Bespoke Tool for the User-Driven Creation of the National Corpus of Contemporary Welsh. Paper presented at the Corpus Linguistics Conference 2017, July 2017, University of Birmingham.
  • Needs, J., Knight, D., Morris, S., Fitzpatrick, T., Thomas, E.M. and Neale, S. (2017). “How will you make sure the material is suitable for children?”: User-informed design of Welsh corpus-based learning/teaching tools. Paper presented at the Corpus Linguistics Conference 2017, July 2017, University of Birmingham.
  • Knight, D., Fitzpatrick, T., Morris, S., Evas, J., Rayson, P., Spasić, I., Stonelake, M., Thomas, E.M., Neale, S., Needs, J., Piao, S., Rees, M., Watkins, G., Anthony, L., Cobb, T.M., Deuchar, M., Donnelly, K., McCarthy, M. and Scannell, K. (2017). Creating CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – The National Corpus of Contemporary Welsh). Paper presented as part of the CMLC-BigNLP2017 National Corpora Poster Track at the Corpus Linguistics Conference 2017, July 2017, University of Birmingham.

News (March 2017)

17/02/2017 – CorCenCC Crowdsourcing App launch

To coincide with the launch of the website, February also saw the first release of the CorCenCC crowdsourcing app. The app is currently available on iOS, and an Android version will be released within the next two to four months (keep an eye out for that!).

News of the app release was featured on the websites of all partner institutions, on tech websites and in Y Cymro and the Denbighshire Free Press (amongst others). We are hoping that by spreading the word about the app and project, we can raise people’s awareness of the importance and value of the work, and get as many people as possible involved in contributing data and/or using the corpus when it is finally constructed.

28/02/2017 – Project launch

To celebrate a successful first 12 months of the project, the CorCenCC team hosted a launch event at the Pierhead Building in Cardiff Bay. The event was supported by a substantial media campaign, which included radio interviews on the BBC’s Good Morning Wales programme (PI Dawn Knight) and BBC Radio Cymru’s Post Cyntaf (Ambassador Nia Parry), as well as print and online press coverage in various outlets (including the BBC and Mail Online, institutional websites and tech blogs, amongst others). It aimed to act as a springboard for engaging with the public, policy makers, educators, publishers and the media, raising awareness of the project and encouraging individuals to support the work.


The launch, attended by Alun Davies AM, Minister for Lifelong Learning and Welsh Language, gave guests the chance to find out more about the project, which is a collaboration between Cardiff, Swansea, Lancaster and Bangor universities, and is breaking new ground in creating a large-scale, open access corpus of the contemporary Welsh language. Backed by high-profile ambassadors poet Damian Walford Davies, musician and presenter Cerys Matthews, broadcaster Nia Parry and international rugby referee Nigel Owens, CorCenCC is community-driven and uses mobile and digital technologies to enable public collaboration. A demonstration of our new data collection app, which enables Welsh speakers from all walks of life to contribute to the project, was on show at the event. CorCenCC partners and ambassadors also shared their impressions of how the resource will impact on their research, and on the Welsh language community more widely.


Alun Davies, Steve Morris, Dawn Knight, Bethan Jenkins and Tess Fitzpatrick

Minister for Lifelong Learning and the Welsh Language, Alun Davies, said: “I am very pleased to attend the launch of this exciting project today. Not only will this work give us a real record of how Welsh is actually being used, but it will also feed into our aim of developing the role of the Welsh language in technology which will be key if we are to meet our target of a million Welsh speakers by 2050.”


The CorCenCC team

Around 85 people attended the launch, and the evening also marked the first time that the majority of the extended CorCenCC team had assembled in the same place! The launch was funded by the British Council, the School of English, Communication and Philosophy at Cardiff University, and the Research Institute for Arts and Humanities at Swansea University – many thanks for your support!


01/03/2017 – Whole Project Team meeting

Hot on the heels of the launch event, we held the first Whole Project Team meeting at Cardiff University on St David’s Day. The meeting, which will take place annually, brings together the CorCenCC Project Team (CPT – which comprises the PI, all CIs, RAs and PhD students), Consultants and all members of the Project Advisory Group, and is a great opportunity for the team to get to know each other a little better (face-to-face) and to discuss ideas and future plans. The aim of the meeting was to provide specific work package (WP) updates, to consider and discuss potential routes to engagement for the project as a whole (concentrating on input mainly from the Project Advisory Group) and to think about how we can best push the boundaries in current corpus research with future developments on CorCenCC.


We would like to say a big thank you to all of you who travelled from far and wide to attend this meeting. We all thought it was very successful and engaging, and it is likely to give us added strength in ideas and motivation to fuel the next steps of development on the project. We are looking forward to having you all back in Cardiff for the meeting in 2018!


CorCenCC newsletter – previous editions
