Half a century of Sri Lanka research: Subjects, researchers, institutions, journals and impact (1973-2019)

Introduction: Bibliometric analyses of research in Sri Lanka, a lower-middle income island nation in South Asia, have focused mainly on medical research, concluding that there is a need for increased research productivity and impact, and for local solutions to health concerns. There has been no general bibliometric analysis across scientific disciplines in the nation, nor any study that covers a long period of time to identify general time trends. Objective: To measure and analyse Sri Lanka research by focusing on subjects, authors, institutions, journals and citations for half a century. Methods: We used an advanced search method to extract publications with the words “Sri Lanka” in SCI-EXPANDED, and calculated indicators such as total citations from Web of Science Core Collection from publication year to the end of 2019, citations in 2019, and mean citations per publication. Journal data were taken from the 2019 Journal Citation Reports. Affiliation re-classification was done to ensure consistency regarding the origin of all publications. Publications were further analysed based on collaboration, and first and corresponding authorship.

Ever since Belgian librarian Paul Otlet coined the term 'bibliometrics' in the 1930s, it has grown in popularity as a research method (Rousseau, 2014); it uses statistical methods to measure the quantity, quality and impact of books, articles, and other forms of scientific publication (Durieux & Gevenois, 2010). Bibliometric indicators are often used as criteria for funding, appointments, and promotions of researchers (Durieux & Gevenois, 2010); and, from a broader perspective, bibliometrics are used to compare the scientific performance of countries, journals, research specialities and subject categories (Bah et al., 2019). Therefore, bibliometrics are one technique for depicting scientific research (Fu & Ho, 2013) and are part of the criteria for decisions in the development of science (Lucio-Arias & Leydesdorff, 2009). On the other hand, studies based on bibliometrics have been criticized because databases are biased against researchers publishing in small countries and in languages other than English. For example, 95 % of Vietnamese articles are missing from the Science Citation Index because they are written in Vietnamese, and 78 % of Cuban science is also missing because it is published in Cuban journals that are not in that database; it would be unfair to conclude that all the science missing from the Science Citation Index is of poor quality or not useful (Añino et al., 2021).
Previous bibliometric studies in this series have evaluated the scientific publication output, trends, research fields, and citations in some databases across countries and continents, including for example Serbia (Ivanović & Ho, 2014), Costa Rica (Monge-Nájera & Ho, 2012), and Taiwan (Chuang & Ho, 2015).
Sri Lanka is a lower-middle income island country in South Asia, with a population of nearly 22 million (Department of Census and Statistics of Sri Lanka, 2020). Successive governments since the turn of the 21st century have instituted numerous schemes to promote research and development, including increased funding, support, and recognition in the form of research awards, which are expected to increase productivity and impact (Pratheepan & Weerasooriya, 2016).
Although studies about science in the country are scarce, it is known that many factors affect the citation of Sri Lanka medical research publications in the databases, including novelty, topic, study design, language, journal and collaborations (Annalingam et al., 2014).
The Science Citation Index fails to include most Sri Lanka medical journals, of which only one was indexed even in PubMed a decade ago (Ranasinghe et al., 2011). At the time, around 60 % of the national research output in medicine was published in non-indexed local journals (Ranasinghe et al., 2012). It is reasonable to assume that authors from other scientific disciplines in Sri Lanka also publish in local non-indexed journals, limiting the accessibility of local research publications to the international readership. A suggested solution was making journals accessible online (Ranasinghe et al., 2011).
Sri Lanka suffers from a rising incidence of both communicable diseases, such as dengue, leptospirosis and tuberculosis; and non-communicable diseases, such as diabetes, cardiovascular disease and malignancies (Gunawardene, 1999). In addition, injuries, including road traffic accidents, are a major cause of hospital admission and mortality (Ranasinghe et al., 2012). The need to address those local problems explains the dominance of health-related research projects, but the Sri Lanka research community needs to improve both productivity and impact according to Ranasinghe et al. (2012). It is likely that similar concerns exist in other disciplines of science in Sri Lanka.
The barriers and challenges to research in this developing nation from the Indian subcontinent are likely to be shared with neighbouring countries with similar socio-economic backgrounds, like India, Pakistan, Bangladesh and Nepal. Therefore, our bibliometric evaluation of research productivity in Sri Lanka can help provide insight into the impact, productivity and trends in scientific publications in South Asia, home to nearly a quarter of the world's population.
There are so few studies of general science in Sri Lanka that there are no controversial hypotheses or general trends that can be discussed, so we decided to establish, with this report, a baseline of comprehensive data for the country's science. Our goal here is to assess total scientific performance and impact; the most productive institutions and authors; favoured journals; the most productive subject categories; and the extent of national and international collaboration. In essence, we report that Sri Lanka is not different from other tropical countries, having a limited scope of research fields and an undesirable dependence on foreign projects.
The main limitation of this study is that it only considers data from the Science Citation Index (expanded version), which is highly biased against countries outside North America and Europe and fails to cover most Sri Lanka publications, which nevertheless must be read and cited, and are generally important for the country and nearby areas.

MATERIALS AND METHODS
Data were generated from the Science Citation Index Expanded online version, the Web of Science Core Collection database, and Clarivate Analytics. We used this, and not another database, because it is very selective and is the base with which we are familiar and for which we have adjusted our long-term research project. We used an advanced search method to extract publications with the word "Sri Lanka" in the country field, to search for publications between 1973 and 2019. The search was performed on 11 August 2020 by one author (Y.S.H.). The SCI-EXPANDED records and the number of citations in each year per publication were coded manually in Microsoft Excel (method details in Li & Ho, 2008). Data were checked by the local authors (P.R. and C.K.L.) to remove incorrect and duplicated data.
To explore the citation rate of a publication, the indicators C 2019, TC 2019, and CPP 2019 (defined below) were used in the present analysis. The number of citations in the Web of Science Core Collection varies with time. Hence, Ho's group proposed TC year (Chuang et al., 2011; Wang et al., 2011), the total number of citations in the Web of Science Core Collection from the publication year to the end of the most recent year, for example 2019 in this study (TC 2019). This indicator makes total citations a constant, which can be repeated and checked (Ho & Fu, 2016). Hence, TC 2019 was defined as the total number of citations from publication to the end of 2019. Citations per publication (CPP 2019) was defined as total citations (TC 2019) divided by the total number of publications (TP) (CPP 2019 = TC 2019/TP). In addition, C 2019, the total number of citations in 2019 only, was also evaluated. This measures the influence of an article in the current year (Ho, 2012). The advantage of using TC year and C year is that they ensure the repeatability of results, compared with using the citation count directly from the Web of Science Core Collection (Fu et al., 2012). Furthermore, it has also been pointed out that it may not be appropriate to use a single indicator to evaluate the impact of an article (Ho & Hartley, 2016). For example, researchers should pay more attention to articles with a high C year and not only to those with a high TC year, because some highly cited articles of the past with a high TC year may not have had the same high impact in recent years (Ho & Hartley, 2016).
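The three indicators defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Publication` record, its field names, and the sample citation counts are hypothetical and are not taken from the study's data or from any Web of Science export format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Publication:
    """Hypothetical record: one publication with its yearly citation counts."""
    title: str
    year: int                          # publication year
    citations_by_year: Dict[int, int]  # citing year -> citations received in that year


def tc_year(pub: Publication, year: int) -> int:
    """TC year: total citations from the publication year to the end of `year`."""
    return sum(n for y, n in pub.citations_by_year.items() if pub.year <= y <= year)


def c_year(pub: Publication, year: int) -> int:
    """C year: citations received in `year` only."""
    return pub.citations_by_year.get(year, 0)


def cpp_year(pubs: List[Publication], year: int) -> float:
    """CPP year = TC year / TP, where TP is the number of publications."""
    if not pubs:
        return 0.0
    return sum(tc_year(p, year) for p in pubs) / len(pubs)


# Made-up example data, for illustration only.
sample = [
    Publication("A", 2010, {2010: 1, 2015: 4, 2019: 5, 2020: 7}),
    Publication("B", 2018, {2018: 0, 2019: 2}),
]
print(tc_year(sample[0], 2019), c_year(sample[0], 2019), cpp_year(sample, 2019))
```

Because `tc_year` ignores citations after the cut-off year, rerunning it later on a larger citation record gives the same value, which is the repeatability property the text attributes to TC year.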
The impact factors (IF 2019) of the journals were taken from the Journal Citation Reports (JCR) published in 2019. Affiliation re-classification was done as described below to ensure consistency regarding the origin of all publications, as per recommended standards. Affiliations in England, Scotland, Northern Ireland, Wales, and Anguilla were reclassified as being from the United Kingdom (UK) (Chiu & Ho, 2005). Affiliations in Faroe Islands were reclassified as being from Denmark. Affiliations in French Guiana were reclassified as being from France. Affiliations in Hong Kong were reclassified to be in China (Fu et al., 2012). Affiliations in Fed Rep Ger (Federal Republic of Germany) were reclassified to be in Germany. Affiliations in Yemen Peo Dem R (Democratic Republic of Yemen) were reclassified to be in Yemen. Affiliations in Czechoslovakia were checked and reclassified to be in Czech Republic (Lin & Ho, 2015). Affiliations in Senegambia were checked and reclassified to be in Gambia. Affiliations in USSR and Rep of Georgia were checked and reclassified to be in Georgia (Republic of Georgia). In addition, affiliations in Austl. were reclassified to be in Australia. Similarly, affiliations in Czechoslovak Acad Sci (Czechoslovak Academy of Sciences) were checked and reclassified as being in the Czech Acad Sci (Czech Republic Academy of Sciences). A potential bias in the analysis of institutions occurs when authors use different spellings for the same institution. Therefore, we merged these institutions during analysis (Fu et al., 2014). For example, articles published under the institutional names Peradeniya Univ, Univ Peradeniya, Univ Peradeniya & Sri Lanka, Univ Peradeniya Kandy, Univ Peradeniya Sri Lanka, and Univ Peradeniyai were merged under the University of Peradeniya.
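The re-classification rules above amount to two lookup tables, one for countries and one for institution name variants. The sketch below is a minimal illustration, not the authors' procedure: the function names are hypothetical, and entries that the text says were "checked" were verified manually in the study, whereas this sketch applies them mechanically. USSR records are left out because the text indicates they were checked case by case.

```python
# Country re-classification rules as listed in the text above.
COUNTRY_MAP = {
    "England": "UK",
    "Scotland": "UK",
    "Northern Ireland": "UK",
    "Wales": "UK",
    "Anguilla": "UK",
    "Faroe Islands": "Denmark",
    "French Guiana": "France",
    "Hong Kong": "China",
    "Fed Rep Ger": "Germany",
    "Yemen Peo Dem R": "Yemen",
    "Czechoslovakia": "Czech Republic",  # "checked" manually in the study
    "Senegambia": "Gambia",              # "checked" manually in the study
    "Rep of Georgia": "Georgia",         # "checked" manually in the study
    "Austl.": "Australia",
}

# Institution spelling variants named in the text, merged under one label.
INSTITUTION_MAP = {
    variant: "University of Peradeniya"
    for variant in (
        "Peradeniya Univ",
        "Univ Peradeniya",
        "Univ Peradeniya & Sri Lanka",
        "Univ Peradeniya Kandy",
        "Univ Peradeniya Sri Lanka",
        "Univ Peradeniyai",
    )
}
INSTITUTION_MAP["Czechoslovak Acad Sci"] = "Czech Acad Sci"


def normalize_country(name: str) -> str:
    """Return the re-classified country, or the input unchanged if no rule applies."""
    cleaned = name.strip()
    return COUNTRY_MAP.get(cleaned, cleaned)


def normalize_institution(name: str) -> str:
    """Return the merged institution label, or the input unchanged if no rule applies."""
    cleaned = name.strip()
    return INSTITUTION_MAP.get(cleaned, cleaned)


print(normalize_country("Hong Kong"), "|", normalize_institution("Univ Peradeniyai"))
```

Keeping the rules in plain dictionaries makes the mapping easy to audit against the published list, at the cost of missing spelling variants that were never entered.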
Publications were further analysed based on collaboration, and first and corresponding authorship. Ten types based on the above were evaluated (Monge-Nájera & Ho, 2017a; Monge-Nájera et al., 2020): a) NFR: neither the first nor the corresponding author is from Sri Lanka, b) NR: corresponding author is not from Sri Lanka, c) NF: first author is not from Sri Lanka, d) IC: international collaboration, e) NC: national collaboration, f) II: institutional independent (single institutional articles), g) CI: Sri Lanka independent (only Sri Lanka authors), h) FP: first author is from Sri Lanka, i) RP: corresponding author is from Sri Lanka, and j) FR: both first and corresponding authors are from Sri Lanka. In the SCI-EXPANDED database, the corresponding author is labelled as reprint author, but in this study, we used the term corresponding author. Similarly, in a single institutional article, the institution was classified as both the first and the corresponding author institution (Ho, 2014). Furthermore, in a single-author article where authorship is unspecified, the author was considered as both the first and corresponding author (Ho, 2014). Collaboration was evaluated by the affiliations of the authors in a publication, where 'internationally collaborative articles' were those co-authored with researchers from countries other than Sri Lanka, while 'nationally collaborative articles' were those with authors from different institutions within Sri Lanka.
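The ten publication types above can be expressed as a small classification function. This is a hedged sketch: the `Article` record and its fields are hypothetical, each author is assumed to have a single (institution, country) affiliation, and NC is read as "more than one institution, all within Sri Lanka", which is an interpretation of the definitions rather than a rule stated in this form.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

HOME = "Sri Lanka"


@dataclass
class Article:
    """Hypothetical record; affiliations are (institution, country) pairs."""
    affiliations: List[Tuple[str, str]]
    first_country: str          # country of the first author
    corresponding_country: str  # country of the corresponding (reprint) author


def classify(article: Article, home: str = HOME) -> Set[str]:
    """Return the set of type labels (NFR, NR, NF, IC, NC, II, CI, FP, RP, FR)."""
    labels: Set[str] = set()
    first_home = article.first_country == home
    corr_home = article.corresponding_country == home

    # Authorship-based types.
    labels.add("FP" if first_home else "NF")
    labels.add("RP" if corr_home else "NR")
    if first_home and corr_home:
        labels.add("FR")
    if not first_home and not corr_home:
        labels.add("NFR")

    # Collaboration-based types.
    countries = {country for _, country in article.affiliations}
    institutions = {inst for inst, _ in article.affiliations}
    if countries - {home}:
        labels.add("IC")      # at least one affiliation outside the home country
    else:
        labels.add("CI")      # only home-country authors
        labels.add("II" if len(institutions) == 1 else "NC")
    return labels


example = Article(
    affiliations=[("University of Oxford", "UK"), ("University of Colombo", "Sri Lanka")],
    first_country="UK",
    corresponding_country="UK",
)
print(sorted(classify(example)))
```

Returning a set of labels reflects that the types overlap: one article can be both FR and II, for instance, so counts per type do not sum to the total number of articles.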

RESULTS
Document type, language, year of publication and citation impact:
A total of 16 069 publications with at least one author from Sri Lanka were found in SCI-EXPANDED within 19 document types (Appendix 1). The most common document type was articles (N = 12 298; 77 %), followed by meeting abstracts (N = 2 025; 13 %). The 2 025 meeting abstracts were published in 253 journals, with the top three being Journal of Gastroenterology and Hepatology (N = 161; 8.0 %), Vox Sanguinis (N = 120; 5.9 %), and BJOG-An International Journal of Obstetrics and Gynaecology (N = 104; 5.1 %). Reviews had 1.7 times more citations per publication (CPP 2019) than articles (Appendix 1). Three reviews had a TC 2019 of more than 1 000: "The CMS experiment at the CERN LHC" (CMS Collaboration, 2008) (TC 2019 = 2 839), followed by "The ecological limits of hydrologic alteration (ELOHA): A new framework for developing regional environmental flow standards" (Poff et al., 2010) (TC 2019 = 1 287), and "Biochar as a sorbent for contaminant management in soil and water: A review" (Ahmad et al., 2014). Corrections had the highest number of authors per publication (APP), 616, followed by articles with an APP of 116 (Appendix 1). Four articles had more than 5 000 authors; the article titled "Combinations of single-top-quark production cross-section measurements and |f_LV V_tb| determinations at √s = 7 and 8 TeV with the ATLAS and CMS experiments" (Aaboud et al., 2019) had the highest number of authors (N = 5 213 authors).
Other large group collaborative publications with more than 5 000 authors were "Combined measurement of the Higgs boson mass in pp collisions at √s = 7 and 8 TeV with the ATLAS and CMS Experiments" (Aad et al., 2015) (N = 5 154 authors); "Measurements of the Higgs boson production and decay rates and constraints on its couplings from a combined ATLAS and CMS analysis of the LHC pp collision data at √s = 7 and 8 TeV" (Aad et al., 2016) (N = 5 111 authors); and "Combination of inclusive and differential tt̄ charge asymmetry measurements using ATLAS and CMS data at √s = 7 and 8 TeV" (Aaboud et al., 2018) (N = 5 098 authors). Furthermore, 593 articles had more than 1 000 authors.
Only articles were considered for further analysis because they included complete research reports with methods, results, discussions, and conclusions. Almost all articles that reached this particular database (99.9 % of 12 298 articles) were published in English. The remaining non-English articles were published in German (N = 5), French (N = 2), Spanish (N = 2), and Dutch (N = 1).
The earliest article with a Sri Lanka author in SCI-EXPANDED was published in 1973 (Fig. 1). The most cited articles are labelled "classic articles", following Long et al. (2014).

Collaboration patterns: countries and institutions:
Internationally collaborative articles (IC) received more citations per publication (CPP 2019) than national independent articles (II) or national collaborative articles (NC) (Fig. 2). Furthermore, articles with a first author and/or corresponding author from other countries tended to receive a much higher CPP 2019 than those with a first author and/or corresponding author from Sri Lanka institutions (Fig. 2). Independent articles accounted for 38 % of the total, and 62 % were produced in collaboration with 188 countries; 56 % had a first or a corresponding author from Sri Lanka.
The UK was part of 19 % of collaborative articles, while the USA had the most first author articles (6.9 %) and corresponding author articles (6.4 %) (Appendix 2). Articles with Switzerland as first-author and corresponding author country had the highest CPP 2019 (78 and 84 citations, respectively).

Leading institutions and authors:
In total, 3 076 articles (25 % of 12 298 articles) were single institution articles (II). The remaining 75 % were inter-institutionally collaborative articles, including 17 % nationally collaborative articles (NC) and 83 % internationally collaborative articles (IC). The University of Peradeniya took the leading position for six publication indicators with 3 248 articles (26 %), which included 727 institution independent articles, 2 004 inter-institutionally collaborative articles, 1 534 first author articles, 1 498 corresponding author articles, and 194 single-author articles (Appendix 3). The University of Colombo had the highest number of nationally collaborative articles.
The category of public, environmental and occupational health, with 193 journals, published the most Sri Lanka articles (829 articles; 6.7 % of 12 298 articles), followed by environmental sciences (814) and multidisciplinary sciences (725) (Appendix 5). In 2019, the categories Environmental sciences (TP = 814, rank 2nd), Multidisciplinary sciences (TP = 725, rank 3rd), Particles and fields physics (TP = 513, rank 8th), and Public, environmental and occupational health (TP = 829, rank 1st) were the top four productive categories (Appendix 6).
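As a rough consistency check, the collaboration shares reported in this section can be related to the 38 % independent and 62 % collaborative split given under collaboration patterns. The sketch below assumes that the 17 % (NC) and 83 % (IC) figures are shares of the inter-institutionally collaborative articles, as the wording suggests; because those inputs are rounded, the results are approximate. The constant and function names are illustrative only.

```python
TOTAL_ARTICLES = 12298     # all articles analysed
SINGLE_INSTITUTION = 3076  # single institution articles (II)
NC_WITHIN_INTER = 0.17     # assumed share of inter-institutional articles that are NC
IC_WITHIN_INTER = 0.83     # assumed share of inter-institutional articles that are IC


def collaboration_shares() -> dict:
    """Express II, NC and IC as fractions of all articles."""
    ii = SINGLE_INSTITUTION / TOTAL_ARTICLES
    inter = 1.0 - ii
    nc = NC_WITHIN_INTER * inter
    ic = IC_WITHIN_INTER * inter
    # Sri Lanka independent articles = single institution + national collaboration.
    return {"II": ii, "NC": nc, "IC": ic, "independent": ii + nc}


shares = collaboration_shares()
for label, value in shares.items():
    print(f"{label}: {value:.1%}")
```

Under these assumptions, IC comes to about 62 % of all articles and II plus NC to about 38 %, which agrees with the split reported for independent and collaborative articles.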

Most cited articles:
The "classic" or most cited articles in this database are detailed in Appendix 7, Appendix 8 and Appendix 9; of these, 12 were published in the 2010s and only two in the 2000s. All resulted from smaller Sri Lanka collaborations within large international projects with 2 to 89 participating countries, and Sri Lanka scientists were not first or corresponding authors. Eleven of these articles were published in Lancet, and one each in BioScience, New England Journal of Medicine, Climate Research, and JAMA Oncology.
The earliest classic article was published in 2002 by authors from University of Oxford and University of East Anglia in the UK, and the International Water Management Institute in Sri Lanka; it had a TC 2019 of 1 445 (rank 9th) and a C 2019 of 76 (rank 50th). Nine classic articles ranked among the top 15 in both TC 2019 and C 2019. In addition, 12 of the 15 classic articles dealt with general and internal medicine, while one article each was published in oncology, biology, environmental sciences, and meteorology and atmospheric sciences. Citation histories for the top classic articles appear in Appendix 7 and Appendix 8.

DISCUSSION
In this first comprehensive bibliometric analysis of Sri Lanka scientific research publications in SCI-EXPANDED, we describe the country's overall research productivity and impact, whilst identifying the most productive research institutions and authors. Furthermore, our results show the importance of collaborative research, especially in international mega-projects, to produce high impact values in this database.
Most Sri Lanka publications in the database were full paper articles, which is consistent with findings from other countries, such as Costa Rica (Monge-Nájera & Ho, 2012), Ghana (Boamah & Ho, 2018) and Brunei. Article dominance generally reflects the fact that authors are encouraged by their institutions to publish articles, as opposed to comments, letters, and book reviews; this is done with incentives for career advancement and direct or indirect financial benefits (Boamah & Ho, 2018). The focus on articles, in turn, may represent a bias because it means that trends affecting other types of publications cannot be identified by our analysis.
The second most common document type was meeting abstracts, in a proportion that is higher in Sri Lanka than in Costa Rica (Monge-Nájera & Ho, 2012) and Brunei (5.9 %), for example, although it is similar to the proportion found in the African country of Ghana (Boamah & Ho, 2018). However, when research is presented as a conference abstract, it does not provide a complete representation of the research methods and findings, and is not fully peer-reviewed. Therefore, authors should receive incentives and support to later publish their meeting abstracts as full articles (we do not have data on the percentage of abstracts subsequently published as articles).
Sri Lanka's articles included in the SCI-EXPANDED were mostly in English, similar to other countries around the world (Bah et al., 2019; Boamah & Ho, 2018; Ho et al., 2018; Monge-Nájera & Ho, 2012). On the island, most people speak Sinhalese or Tamil (Department of Census and Statistics of Sri Lanka, 2020), while English is understood by a quarter of the population. The database language bias could therefore hinder academic publishing in Sri Lanka and similar countries, because of a "linguistic injustice" in which non-native speakers of English face substantial challenges in the dissemination of scholarly work (Politzer-Ahles et al., 2016).
Sri Lanka was formerly known as Ceylon and became a republic, adopting its current name, in 1972, around the time of the earliest article with a Sri Lanka author in the SCI-EXPANDED. From that time onwards, there was only a modest increase until 2002, followed by a rapid upsurge since 2010. The reasons for this increase are likely to be multifactorial: government policies (Pratheepan & Weerasooriya, 2016) as well as an increase in the interest of the database itself in covering "third world" countries when growth stopped in industrialized nations.
In recent years, Ho's group proposed a relationship between country publication types (based on collaboration and authorship status) and their citations per publication, to evaluate their impact (Chuang & Ho, 2015). The results show that internationally collaborative articles from Sri Lanka received a higher CPP 2019 than institutionally independent or nationally collaborative articles; this reflects the fact that local projects have much smaller budgets and do not have access to the journals of countries that are better covered by the database we used. Furthermore, articles with a first author and/or corresponding author from other countries received a much higher CPP 2019 for the same reason.
Well-financed international collaborative projects (1) enable the sharing of new techniques, skills, knowledge and high-end facilities, synergising expertise and producing articles with many authors and wider exposure in large journals, and (2) often study pressing health issues; therefore, they are more likely to receive more citations and should be encouraged when done on a fair basis (Glänzel, 2001; Khor & Yu, 2016). Conversely, the lower impact of local authors in this particular database indicates smaller resources and insufficient coverage in this database of journals published in Asia and other less industrialized regions.
Our results also show that publications from Sri Lanka had a higher CPP 2019 than those from countries with similar socio-economic status, such as Guatemala, Benin (Monge-Nájera et al., 2020), Ecuador (Calahorrano et al., 2020), Ghana (Boamah & Ho, 2018), El Salvador (Monge-Nájera & Ho, 2017b), and Brunei. Possible reasons include a larger presence of Sri Lanka in international megaprojects or more coverage of Asia, over Latin America, in this database (Smith et al., 2014). Additionally, factors affecting citation in this database include a) article related factors: quality, novelty, subject area, study topic and study design, b) journal related factors: language, scope and form of publication, and c) author related factors: number, reputation, collaborations and country (Tahamtan et al., 2016). Most of those factors have also been identified in Sri Lanka medical research (Annalingam et al., 2014), and might affect the trend found here. Furthermore, the SCI-EXPANDED has poor coverage of Sri Lanka journals, and most authors published articles in the Journal of the National Science Foundation of Sri Lanka (Ranasinghe et al., 2011; Ranasinghe et al., 2012).
We found that, in Sri Lanka, public universities were the most prolific institutions, and this was similar to what has been found in other developing countries, including Cameroon (Tchuifon Tchuifon et al., 2017), Costa Rica (Monge-Nájera & Ho, 2012), El Salvador (Monge-Nájera & Ho, 2017b) and Nicaragua (Monge-Nájera & Ho, 2017a). Universities by definition are expected to be institutions of higher learning providing facilities for both teaching and research, where academic scholars/researchers receive recognition, promotion and funding for future research through their publications (Pratheepan & Weerasooriya, 2016). Hence, it is not surprising that Sri Lanka universities are at the forefront of research publications in the country. Even in developed countries like Germany, the world's third largest producer of scientific research, universities have consistently produced two-thirds of the publications in the highest quality journals (Dusdal et al., 2020).
A relationship between the percentage of publications in a country and the number of journals in each Web of Science category has been proposed (Monge-Nájera & Ho, 2017a; Monge-Nájera & Ho, 2017c), and when we applied it to Sri Lanka, we found that the island published most articles in the category of public, environmental and occupational health. This was similar to Honduras (Monge-Nájera & Ho, 2017a), El Salvador (Monge-Nájera & Ho, 2017b), and Nicaragua (Monge-Nájera & Ho, 2017c) in Central America; and Ghana (Boamah & Ho, 2018) in Africa. However, Brunei in Asia published the most articles in the category of ecology, showing another trend, namely the emphasis on conservation in countries where basic health issues have been controlled to some extent. In the case of Sri Lanka, the health situation (Gunawardene, 1999; Ranasinghe et al., 2012) explains why research efforts are concentrated in health, an area of national priority.
In conclusion, Sri Lanka authors publish mainly in the area of public, environmental and occupational health, reflecting the priorities of the country. The impact of Sri Lanka research articles is highest when they are published as part of international collaborative projects where, unfortunately, Sri Lanka authors only play a secondary role and the leaders are well-funded researchers from large industrialized countries. Sri Lanka policy makers must find a balance between the two options of focusing support on the most prolific authors and institutions identified in this study, or focusing on emerging researchers and institutions that are less likely to obtain foreign funds. Sri Lanka authors should be encouraged (1) to expand their horizons beyond short-term goals, i.e. researching non-applied fields that are the basis of all innovation; (2) to strengthen their own journals so that they have better visibility and impact; and (3) to additionally publish more in large international journals as part of teams where they are also among the leaders instead of just accepting secondary roles.
Ethical statement: the authors declare that they all agree with this publication and made significant contributions; that there is no conflict of interest of any kind; and that we followed all pertinent ethical and legal procedures and requirements. All financial sources are fully and clearly stated in the acknowledgements section. A signed document has been filed in the journal archives.

ACKNOWLEDGMENTS
We thank Carolina Seas for her valuable assistance with manuscript preparation.