We are pleased to announce the following presentations and publications of CorCenCC work (CorCenCC team members in boldface):

Publications:

  • Piao, S., Rayson, P., Knight, D. and Watkins, G. (2018). Towards a Welsh Semantic Annotation System. Proceedings of the LREC (Language Resources Evaluation) 2018 Conference, May 2018, Miyazaki, Japan.
  • Neale, S., Donnelly, K., Watkins, G. and Knight, D. (2018). Leveraging Lexical Resources and Constraint Grammar for Rule-Based Part-of-Speech Tagging in Welsh. Poster presented at the LREC (Language Resources Evaluation) 2018 Conference, May 2018, Miyazaki, Japan.
  • Rayson, P. (2018). Increasing Interoperability for Embedding Corpus Annotation Pipelines in Wmatrix and other corpus retrieval tools. Proceedings of the Challenges in the Management of Large Corpora workshop at the LREC (Language Resources Evaluation) 2018 Conference, May 2018, Miyazaki, Japan.
  • Rayson, P. and Piao, S. (2017). Creating and Validating Multilingual Semantic Representations for Six Languages: Expert versus Non-Expert Crowds. Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications held at the European Chapter of the Association for Computational Linguistics 2017 (EACL) conference, April, Valencia.
  • Piao, S., Rayson, P., Archer, D., Bianchi, F., Dayrell, C., El-Haj, M., Jiménez R-M., Knight, D., Křen, M., Löfberg, L., Nawab, R., Shafi, J., Teh, P-L. and Mudraya, O. (2016). Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages. Proceedings of the LREC (Language Resources Evaluation) 2016 Conference, May 2016, Slovenia.

Presentations:

Invited talks:

  • Knight, D. (2018). Representativeness in CorCenCC: corpus design in minoritised languages. Invited plenary delivered to the JET workshop as part of the French Cognitive Linguistics Association (AFLiCo) conference, 3 – 4 May 2018. Paris, France.
  • Knight, D. (2018). An overview of the CorCenCC Welsh Corpus project. Invited presentation delivered as part of the Applied Linguistics Research Seminar Series, Swansea University, 2nd February 2018.
  • Rayson, P. (2017). Don’t just look at the words: semantic annotation tools for the analysis of academic discourse. Invited talk at the 1st International Conference on Corpus Analysis in Academic Discourse (CAAD), Valencia, Spain.
  • Morris, S. (2017). CorCenCC: Corpws Cenedlaethol Cymraeg Cyfoes – ar drywydd y Deg Miliwn. Centre for Welsh and Advanced Celtic Studies seminar programme, Aberystwyth. 23/11/17.
  • Knight, D. (2017). Big Data and Corpus Construction: Introducing CorCenCC. Invited seminar presentation at the Investigating (with) Big Data event run by the Cardiff University Digital Humanities Network, 24/5/17, Cardiff University.
  • Knight, D. (2017). Research funding and building networks in the Arts, Humanities and Social Sciences: the case of CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – The National Corpus of Contemporary Welsh). Invited seminar presentation as part of the Cardiff School of Journalism, Media and Cultural Studies 2016/17 research seminar series, 5/4/17, Cardiff University.
  • Knight, D. (2017). Constructing corpora of minoritised languages: A focus on CorCenCC. Invited plenary presentation delivered as part of the Corpus Linguistics in the South Conference, 4/3/17, Birkbeck University.
  • Knight, D. (2016). Constructing E-Language Corpora: a focus on CorCenCC (The National Corpus of Contemporary Welsh). Invited plenary presentation at the 4th Computer-Mediated Communication and Social Media Corpora for the Humanities conference, 27-28/9/16, University of Ljubljana, Slovenia.
  • Knight, D. (2016). Innovations in corpus-based research. Invited seminar presentation at the Tokyo Chapter of the Japanese Association of Language Teachers (JALT) meeting, 9/9/16, Tokyo.
  • Knight, D. (2016). The application of corpora: supporting and informing the pedagogic landscape. Invited plenary presentation at the InForm Conference, 16/7/16, Durham University.
  • Knight, D. (2016). Corpora and Pedagogy: developing the community-driven National Corpus of Contemporary Welsh. Invited presentation at the Welsh for Adults annual conference, 8/7/16, Cardiff.
  • Knight, D. (2016). The National Corpus of Contemporary Welsh: A community driven approach to linguistic corpus construction. Invited presentation at the UCREL Corpus Research Seminar Series, 9/6/16, Lancaster University.

Conference presentations:

  • Knight, D., Morris, S. and Fitzpatrick, T. (2018). From vision to reality: reflections on securing and managing large research projects. Paper to be presented at the BAAL (British Association for Applied Linguistics) 2018 conference, York St. John University, UK.
  • Knight, D., Morris, S., Fitzpatrick, T., Morris, J., Rayson, P., Spasić, I., Thomas, E.M., Neale, S., Needs, J., Piao, S., Rees, M. and Williams, L. (2018). CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – National Corpus of Contemporary Welsh): A demonstration. Paper to be presented at the BAAL (British Association for Applied Linguistics) 2018 conference, York St. John University, UK.
  • Knight, D. (2018). Corpus Design and Construction: the challenges faced by minoritized language. Paper presented at the IVACS (Inter-Varietal Applied Corpus Studies) 2018 conference, Valletta, Malta.
  • Morris, S., Knight, D. and Fitzpatrick, T. (2018). CorCenCC: applying the sociolinguistics of new speakers within a contemporary corpus of Welsh. Paper presented at the IVACS (Inter-Varietal Applied Corpus Studies) 2018 conference, Valletta, Malta.
  • Piao, S., Rayson, P., Knight, D. and Watkins, G. (2018). Towards a Welsh Semantic Annotation System. Paper presented at the LREC (Language Resources Evaluation) 2018 Conference, May 2018, Miyazaki, Japan.
  • Rayson, P. (2018). Increasing Interoperability for Embedding Corpus Annotation Pipelines in Wmatrix and other corpus retrieval tools. Paper presented as part of the Challenges in the Management of Large Corpora workshop at the LREC (Language Resources Evaluation) 2018 Conference, May 2018, Miyazaki, Japan.
  • Neale, S., Donnelly, K., Watkins, G. and Knight, D. (2018). Leveraging Lexical Resources and Constraint Grammar for Rule-Based Part-of-Speech Tagging in Welsh. Paper presented as part of the LREC (Language Resources Evaluation) 2018 Conference, May 2018, Miyazaki, Japan.
  • Rees, M., Needs, J., Williams, L., Morris, S. and Knight, D. (2018). My Welsh is rubbish: Corpus data collection in a lesser-used language context – some challenges. Paper presented at the IVACS (Inter-Varietal Applied Corpus Studies) one-day symposium on Corpus Research in Challenging Contexts, February 2018, Maynooth University.
  • Needs, J., Rees, M., Williams, L., Morris, S. and Knight, D. (2018). Representing contemporary Welsh – who speaks it, where, when and how: Designing, collecting and transcribing CorCenCC’s spoken component. Paper presented at the IVACS (Inter-Varietal Applied Corpus Studies) one-day symposium on Corpus Research in Challenging Contexts, February 2018, Maynooth University.
  • Rayson, P. and Piao, S. (2017). Creating and Validating Multilingual Semantic Representations for Six Languages: Expert versus Non-Expert Crowds. Paper presented at the 1st Workshop on Sense, Concept and Entity Representations and their Applications held at the European Chapter of the Association for Computational Linguistics 2017 (EACL) conference, April, Valencia. pp.61-71.
  • Knight, D., Fitzpatrick, T. and Morris, S. (2017). CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – The National Corpus of Contemporary Welsh): An overview. Paper presented as part of the annual British Association for Applied Linguistics (BAAL) conference, September 2017, University of Leeds.
  • Morris, S., Fitzpatrick, T. and Knight, D. (2017). Creating pedagogic wordlists in an under-resourced language. Poster presented as part of the annual British Association for Applied Linguistics (BAAL) conference, September 2017, University of Leeds.
  • Piao, S., Rayson, P., Watkins, G., Knight, D. and Donnelly, K. (2017). Towards a Welsh Semantic Tagger: Creating Lexicons for A Resource Poor Language. Paper presented at the Corpus Linguistics Conference 2017, July 2017, University of Birmingham.
  • Rees, M., Watkins, G., Needs, J., Morris, S. and Knight, D. (2017). Creating a Bespoke Corpus Sampling Frame for a Minoritised Language: CorCenCC, the National Corpus of Contemporary Welsh. Paper presented at the Corpus Linguistics Conference 2017, July 2017, University of Birmingham.
  • Neale, S., Spasić, I., Needs, J., Watkins, G., Morris, S., Fitzpatrick, T., Marshall, L. and Knight, D. (2017). The CorCenCC Crowdsourcing App: A Bespoke Tool for the User-Driven Creation of the National Corpus of Contemporary Welsh. Paper presented at the Corpus Linguistics Conference 2017, July 2017, University of Birmingham.
  • Needs, J., Knight, D., Morris, S., Fitzpatrick, T., Thomas, E.M. and Neale, S. (2017). “How will you make sure the material is suitable for children?”: User-informed design of Welsh corpus-based learning/teaching tools. Paper presented at the Corpus Linguistics Conference 2017, July 2017, University of Birmingham.
  • Knight, D., Fitzpatrick, T., Morris, S., Evas, J., Rayson, P., Spasić, I., Stonelake, M., Thomas, E.M., Neale, S., Needs, J., Piao, S., Rees, M., Watkins, G., Anthony, L., Cobb, T.M., Deuchar, M., Donnelly, K., McCarthy, M. and Scannell, K. (2017). Creating CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – The National Corpus of Contemporary Welsh). Paper presented as part of the CMLC-BigNLP2017 National Corpora Poster Track at the Corpus Linguistics Conference 2017, July 2017, University of Birmingham.
  • Knight, D., Morris, S., Fitzpatrick, T. and Anthony, L. (2016). Charting the vocabulary of a minoritised language: Challenges and opportunities in the creation and application of the National Corpus of Contemporary Welsh. Paper presented at the Vocab@Tokyo international conference, September 2016, Tokyo, Japan.
  • Fitzpatrick, T., Knight, D. and Morris, S. (2016). Creating pedagogical wordlists: a comparison of thematic and corpus approaches. Poster presented at the Pacific Second Language Research Forum (PacSLRF2016), September 2016, Tokyo, Japan.
  • Knight, D., Fitzpatrick, T. and Morris, S. (2016). CorCenCC – Corpws Cenedlaethol Cymraeg Cyfoes (The National Corpus of Contemporary Welsh). WISERD (Wales Institute of Social and Economic Research, Data and Methods), July 2016, Swansea University.
  • Knight, D., Neale, S., Watkins, G., Spasić, I., Morris, S. and Fitzpatrick, T. (2016). Crowdsourcing corpus construction: contextualizing plans for CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – The National Corpus of Contemporary Welsh). Paper presented at the IVACS 2016 conference, June 2016, Bath Spa University.
  • Needs, J., Rees, M., Watkins, G., Morris, S., Knight, D. and Fitzpatrick, T. (2016). CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – The National Corpus of Contemporary Welsh): Challenges and applications in a minoritised language context. Paper presented at the IVACS 2016 conference, June 2016, Bath Spa University.
  • Piao, S., Rayson, P., Archer, D., Bianchi, F., Dayrell, C., El-Haj, M., Jiménez R-M., Knight, D., Křen, M., Löfberg, L., Nawab, R., Shafi, J., Teh, P-L. and Mudraya, O. (2016). Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages. Paper delivered at the LREC (Language Resources Evaluation) 2016 Conference, May 2016, Slovenia.