21 Dec 2016

Does Dirty Data Affect Google Scholar Citations? The case of the academic profiles of 11 Turkish researchers

G Doğan, I Şencan, Y Tonta
Does dirty data affect google scholar citations?
ASIST '16. Proceedings of the 79th ASIS&T Annual Meeting: Creating Knowledge, Enhancing Lives through Information & Technology. Copenhagen, Denmark, October 14-18, 2016

OBJECTIVES


The main goal of this study is to find out whether Google Scholar citation metrics fluctuate depending on the presence of duplicate publications and citations in the database. The following research questions are addressed:


- Does Google Scholar database include duplicate publications and citations in researchers’ profiles?

- If yes, what is the impact of this practice on citation counts and Google Scholar Citations metrics such as h- and i10-index values? 
Answering these questions will shed some light on the size of the problem and help us better interpret the rankings and metrics based on GS data.



METHODS
The 11 researchers based at Hacettepe University’s Department of Information Management with public GS profiles (as of January 27, 2016) were selected. Data were collected and cleaned between January 27 and March 18, 2016.
The Google Scholar profiles of the 11 researchers were checked to identify duplicate records for the same publications. Next, the number of different records for each publication, and the citations thereto, were identified, as well as singular publication counts for each researcher and combined citation counts for each publication. The h- and i10-indexes were then re-calculated for each researcher using the new publication and combined citation counts, and compared with the Google Scholar Citations metrics.
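To make the re-calculation step concrete, here is a minimal sketch (our illustration, not the authors' code) of how the h-index and i10-index can be recomputed from a researcher's de-duplicated, combined citation counts; the input data below are hypothetical.

```python
# Minimal sketch: recompute h-index and i10-index from a list of
# de-duplicated (combined) citation counts for one researcher.

def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

def i10_index(citations):
    """Number of publications with at least 10 citations."""
    return sum(1 for c in citations if c >= 10)

# Hypothetical combined citation counts after merging duplicate records
# and removing duplicate citations.
cleaned_counts = [42, 17, 12, 9, 9, 6, 3, 1, 0]
print(h_index(cleaned_counts), i10_index(cleaned_counts))  # -> 6 3
```

Comparing these values before and after cleaning is exactly the kind of check the study performs against the metrics reported by Google Scholar Citations.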

RESULTS

Duplicate Publications
- 14% (n=69) of the publications (N=499) were represented by more than one record (mostly 2, max. 5)
- Excluding duplicate records did not reduce the number of citations (only 4 out of 69 publications were affected)
- None of the researchers’ re-calculated h-indexes changed, and only one researcher’s i10-index increased by 1
Duplicate Citations
- 135 publications (55%) received a total of 364 duplicate citations: 12% of all citations (3,079)
- When duplicate citations were removed, the citation counts of half of the 135 publications decreased by at least two citations
- Citation counts of almost all researchers decreased, some by as much as 20%
- h-indexes of more than half the researchers decreased by at least 1
- i10-indexes of four researchers decreased by 2 to 4, although one researcher’s i10-index increased by 1
DISCUSSION

The findings confirm our hypothesis.

We cannot generalize: the sample is small and skewed (11 Turkish researchers), and national, linguistic, and disciplinary peculiarities may be at play.
Further studies with larger and more representative samples are needed.

Available at
http://yunus.hacettepe.edu.tr/~tonta/yayinlar/ASIST2016_poster.pdf

20 Dec 2016

H-index manipulation by merging articles in Google Scholar Profiles: Models, theory, and experiments

R van Bevern, C Komusiewicz, R Niedermeier, M Sorge, T Walsh
H-index manipulation by merging articles: Models, theory, and experiments
Artificial Intelligence 2016, 240: 9–35
doi.org/10.1016/j.artint.2016.08.001

The H-index is a widely used measure for estimating the productivity and impact of researchers, journals, and institutions. Several publicly accessible databases such as AMiner, Google Scholar, Scopus, and Web of Science compute the H-index of researchers. Such metrics are therefore visible to hiring committees and funding agencies when comparing researchers and proposals. 

Although the H-index of Google Scholar profiles is computed automatically, profile owners can still affect their H-index by merging articles in their profile. The intention behind providing the option to merge articles is to enable researchers to identify different versions of the same article. Merging may decrease a researcher’s H-index if both articles counted towards it before merging, or increase it, since the merged article may have more citations than each of the individual articles. Since the Google Scholar interface permits merging arbitrary pairs of articles, this leaves the H-index of Google Scholar profiles vulnerable to manipulation by insincere authors.
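To illustrate why merging can cut both ways, here is a small sketch (ours, not the paper's code) that compares a profile's H-index before and after a merge, assuming for simplicity that a merged article's citation count is the sum of the merged versions' counts; the paper analyses several ways of counting a merged article's citations, and this sum rule is used here only for illustration.

```python
# Sketch: effect of merging two articles on the H-index, assuming the
# merged article's citation count is the sum of both counts (one simple
# merge measure; the paper studies several).

def h_index(citations):
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

def merge(citations, i, j):
    """Citation counts after merging articles i and j under the sum rule."""
    rest = [c for k, c in enumerate(citations) if k not in (i, j)]
    return rest + [citations[i] + citations[j]]

profile = [5, 4, 4, 3, 3]               # hypothetical citation counts
print(h_index(profile))                 # 3
print(h_index(merge(profile, 3, 4)))    # merging the two 3-citation articles raises h to 4
print(h_index(merge([5, 5, 5], 0, 1)))  # merging two h-core articles lowers h from 3 to 2
```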

Results:

1. We propose two further ways of measuring the number of citations of a merged article. One of them seems to be the measure used by Google Scholar.

2. We propose a model for restricting the set of allowed merge operations. Although Google Scholar allows merges between arbitrary articles, such a restriction is well motivated: an insincere author may try to merge only similar articles in order to conceal the manipulation.

3. We consider the variant of H-index manipulation in which only a limited number of merges may be applied in order to achieve a desired H-index. This is again motivated by the fact that an insincere author may try to conceal the manipulation by performing only a few changes to her or his own profile.

4. We analyze each problem variant presented here within the framework of parameterized computational complexity. That is, we identify parameters p (properties of the input measured in integers) and aim to design fixed-parameter algorithms, which have running time f(p) · n^O(1) for a computable function f independent of the input size n. In some cases, this allows us to give efficient algorithms for realistic problem instances despite the NP-hardness of the problems in general. We also show parameters that presumably cannot lead to fixed-parameter algorithms by showing some problem variants to be W[1]-hard for these parameters (the standard definitions are sketched after this list).


5. We evaluate our theoretical findings by performing experiments with real-world data based on the publication profiles of AI researchers. In particular, we use profiles of some young and up-and-coming researchers from the 2011 and 2013 editions of the IEEE “AI’s 10 to watch” list.
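For readers less familiar with this framework, the standard definitions behind the terms used in item 4 (textbook material, not specific to this paper) can be summarised as follows:

```latex
% Standard parameterized-complexity notions (textbook definitions):
% p is the chosen parameter, n the input size.
\[
  \textbf{FPT: } \; f(p)\cdot n^{O(1)}
  \qquad\text{vs.}\qquad
  \textbf{XP: } \; n^{f(p)}
\]
% An FPT running time confines the combinatorial explosion to the parameter p,
% so instances with small p remain tractable even when n is large.
% W[1]-hardness for a parameter is taken as evidence that no FPT algorithm
% exists for that parameter (under standard complexity-theoretic assumptions).
```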

Available at
 http://www.sciencedirect.com/science/article/pii/S0004370216300844

30 Nov 2016

Microsoft Academic: is the Phoenix getting wings?

A DIGEST OF

Harzing, A-W & Alakangas, S. 
Microsoft Academic: is the Phoenix getting wings?
Scientometrics (in press)


OBJECTIVES
1. To compare publication and citation coverage of the new Microsoft Academic with Google Scholar, Scopus, and the Web of Science.
2. To investigate the extent to which the findings change when using the more liberal “estimated citation count” in Microsoft Academic rather than the more conservative “linked citation count”.
METHOD
Sample
145 Associate Professors and Full Professors at the University of Melbourne, Australia in 37 disciplines grouped into 5 broad disciplinary areas: Life Sciences, Sciences, Engineering, Social Sciences, and Humanities 
Design
For each researcher, 4 indicators were calculated from 4 different sources.
- Indicators: Number of papers; Citations received; h-index; hIa-index (see the sketch below)
- Sources: Google Scholar; Microsoft Academic; Scopus; Web of Science
Publish or Perish (PoP) was used to conduct searches for Google Scholar and Microsoft Academic.
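For reference, the hIa-index is Harzing's annualised individual h-index. The sketch below is our own illustration (exact conventions in Publish or Perish, e.g. how academic age is counted, may differ slightly): each paper's citations are divided by its number of authors, the h-index is taken over those normalised counts, and the result is divided by the researcher's academic age.

```python
# Sketch of the hIa-index (annualised individual h-index) on hypothetical data.
# Definition used here: h-index of author-normalised citation counts divided
# by academic age (years since first publication); PoP's conventions may differ.

def h_index(citations):
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

def hIa(papers, current_year):
    """papers: list of (citations, n_authors, publication_year) tuples."""
    normalised = [cites / n_authors for cites, n_authors, _ in papers]
    academic_age = current_year - min(year for _, _, year in papers) + 1
    return h_index(normalised) / academic_age

# Hypothetical researcher with four papers, evaluated in 2016.
papers = [(40, 2, 2006), (18, 3, 2010), (9, 1, 2013), (4, 4, 2015)]
print(round(hIa(papers, 2016), 2))  # -> 0.27
```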
Period analyzed
All time. Data collected in the first week of October 2016.
RESULTS
Microsoft Academic coverage has improved substantially (an average growth of nearly 10%). The biggest increases are found for books and book chapters, as well as for some publications in minor journals.
In terms of data quality, some problems, namely several erroneous year allocations, and citations that were split between a version of the publication with the main title only and a version with both the main title and a sub-title, have not yet been resolved.
On average, Microsoft Academic citations are very similar to Scopus and Web of Science citations, and substantively lower only than Google Scholar citations.
As to disciplines, Microsoft Academic has fewer citations than Scopus and, marginally, than Web of Science for the Life Sciences and Sciences. In the Social Sciences, however, Microsoft Academic has a clear advantage over both Scopus and Web of Science, providing 1.5 to 2 times as many citations for the sample. The difference is even starker for the Humanities, where Microsoft Academic has a coverage that is 1.7 to nearly 3 times as high.
Google Scholar citations were higher than Microsoft Academic citations for all but one individual in the sample
FIG 1. Average number of papers and citations for 145 academics across Google Scholar, Microsoft Academic, Scopus and Web of Science
  
FIG 2. Average citations for 145 academics across Google Scholar, Microsoft Academic, Scopus and Web of Science, grouped by five major disciplinary areas

Taking Microsoft Academic estimated citation counts rather than linked citation counts as our basis for the comparison with Scopus, Web of Science, and Google Scholar does change the comparative picture quite dramatically.
Looking at our overall sample of 145 academics, Microsoft Academic’s average estimated citation counts (3873) are much higher than both Scopus (2413) and Web of Science (2168) citation counts, and very similar to Google Scholar’s average counts (3982).
For the Life Sciences Microsoft Academic estimated citation counts are in fact 12% higher than Google Scholar counts, whereas for the Sciences they are almost identical.
FIG 3. Comparison of average Microsoft Academic estimated citation counts with Google Scholar citation counts, grouped by five major disciplinary areas

CONCLUSIONS

The study suggests that the new incarnation of Microsoft Academic presents an excellent alternative for citation analysis. Moreover, the comparison of citation growth over the last 6 months also suggests that Microsoft Academic is still increasing its coverage.

FINAL REMARKS

To the best of our knowledge, this work represents one of the first empirical analyses to tackle estimated citation counts, a procedure followed by the new generation of academic search engines.

This constitutes a paramount shift in citation analysis, as it relies on estimated citations instead of real citations. In this sense, we would like to emphasize the following results:

a) When using the more liberal estimated citation counts for Microsoft Academic, its average citation counts were higher than both Scopus and the Web of Science for all disciplines.

b) For the Life Sciences, Microsoft Academic estimated citation counts are even higher than Google Scholar counts, whereas for the Sciences they are almost identical.

c) For Engineering, Microsoft Academic estimated citation counts are 14% lower than Google Scholar citations, whereas for the Social Sciences this is 23%. Only for the Humanities are they substantially (69%) lower than Google Scholar citations.

When comparing Microsoft Academic with Google Scholar, we need to take into account that Google Scholar does not work, at least currently, with estimated citations. Therefore, the fairer comparison should be with Microsoft Academic’s so-called conservative citation counts. However, the similar results found in some disciplines may be a signal of accuracy in the estimation processes, which in turn may change the way in which academic search engines work in the future.

Faced with this scenario, we may ask a question in the wind… do we need to estimate when we can gather everything?

24 Nov 2016

Impact of Macedonian Biomedical Journals in Google Scholar

Mirko Spiroski
Current Scientific Impact of Macedonian Biomedical Journals (2016) in Google Scholar Database Analysed with the Software Publish or Perish
Maced Med Electr J, 2016 May 30; 50022
dx.doi.org/10.3889/mmej.2016.50022

OBJECTIVES
The aims of this paper are: to analyze Macedonian biomedical journals in the Google Scholar database with the software Publish or Perish, to present their current scientific impact, to rank the journals, and to advise authors about where they might publish their papers.

METHODS
Biomedical journals in the Republic of Macedonia included in the Macedonian Association of Medical Editors (MAME) were analyzed. The results were obtained with the software Publish or Perish, which analyzes publicly available scholarly papers in the Google Scholar database (May 11, 2016).

RESULTS
Of the 38 journals, only 25 have one or more papers indexed in the Google Scholar database (Table 1). The remaining 13 journals have no papers indexed in this database and were not included in this investigation. The journals with the largest number of citations in the Google Scholar database are Prilozi - MANU, 0351-3254 (1622); Macedonian Journal of Medical Sciences, 1857-5773 (838); and Macedonian Journal of Chemistry and Chemical Engineering, 1857-5552 (705).
Table 1
Only 18 journals received citations on Google Scholar (Table 2)
Table 2

Available at

8 Nov 2016

The Google Scholar revolution: opening the academic Pandora's box

Enrique Orduna-Malea, Alberto Martín-Martín, Juan M. Ayllón, Emilio Delgado López-Cózar

La revolución Google Scholar: 
Destapando la caja de Pandora académica
Editorial Universidad de Granada; 
UNE. Unión de Editoriales Universitarias Españolas
ISBN 978-84-338-5941-9 (print); ISBN 978-84-338-5985-3 (PDF)

Google Scholar brought a true revolution in the way scientific information is searched, found, and accessed. The simple search box, already a de facto standard for searching information, takes users to numerous and relevant results including all document types, written in any language, and published in any country. Using this search engine is easy, quick, and free. However, what started as a search engine has unintentionally become a very valuable source of data for research evaluation, with bibliometric applications. That is, a true "academic Pandora's box".

The goal of this book is, mainly, to provide an in-depth description of all the characteristics and features of Google Scholar. Its origin and evolution are described, and the way it works, its size, coverage, and growth are analysed thoroughly. All its functionalities are delineated, and its strengths, weaknesses, and dangers are discussed.


A comprehensive review of all the scientific literature dealing empirically with the behaviour of this platform is carried out.

Additionally, this work also discusses the main products that have been derived from the search engine. Google Scholar Citations was born as an author profile service in which researchers can generate a bibliographic profile to display their documents, as well as the number of times they have been cited, including various bibliometric indicators. Google Scholar Metrics is also discussed. This product was created as a ranking of the most influential scientific publications, sorted by their h-index. Apart from journals, it includes some conferences and repositories also covered by Google Scholar.

Third-party products created by independent researchers are also analysed. All these products make use of Google Scholar as a source of data for bibliometric analyses: Publish or Perish, Scholarometer, H Index Scholar, Journal Scholar Metrics, Publishers Scholar Metrics, Proceedings Scholar Metrics and Scholar Mirrors.


In short, this book offers an all-inclusive and meticulous view of Google Scholar and its related products. It is the result of the intense research work carried out by the authors during almost a decade at the Universidad de Granada and the Universidad Politécnica de Valencia, where they work as professors and researchers.

Available from 

28 Oct 2016

Journal Rankings in Sociology: Using the H Index with Google Scholar

Jerry A. Jacobs
Journal Rankings in Sociology: Using the H Index
with Google Scholar
The American Sociologist, 2016, 47(2): 192–224
DOI 10.1007/s12108-015-9292-7

Background
There is considerable interest in the ranking of journals, given the intense pressure to place articles in the "top" journals. In this article, a new index, h, and a new source of data, Google Scholar, are introduced, and a number of advantages of this methodology for assessing journals are noted. This approach is attractive because it provides a more robust account of the scholarly enterprise than do the standard Journal Citation Reports. Readily available software enables do-it-yourself assessments of journals, including those not otherwise covered, and enables the journal selection process to become a research endeavor that identifies particular articles of interest. While some critics are skeptical about the visibility and impact of sociological research, the evidence presented here indicates that most sociology journals produce a steady stream of papers that garner considerable attention.

Methods
The analysis covered 120 sociology journals for the period 2000–2009, and 140 journals for the period 2010–2014. I started with the list of 99 journals included in the Web of Science sociology subject category in 2010, when research on this project began. In several cases, the classification of these publications as academic sociology journals may be questioned on the grounds of subject matter (e.g., Cornell Hospitality Quarterly) or because of the publication’s explicit interdisciplinary orientation (Social Science Research, Population and Development Review). I included these journals on the grounds of both inclusiveness and comparability.
Data for the bibliometric analysis in this article were retrieved from the Google Scholar database and extracted with the assistance of the Publish or Perish software.

Results
Table 1 reports several measures of the visibility of 120 sociology journals. The proposed measure, h, calculated over the period 2000–2009, is provided along with the standard JCR impact factor and the relatively new 5-year impact factor. Table 1 is ordered by the journal’s score on the h statistic measured over the period 2000–2009. I also include a measure of h based on the most recent five years of exposure. Two other statistics, the 5-year and 10-year g statistics, are also listed.
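For reference, the g statistics mentioned above follow Egghe's g-index: the largest g such that the top g articles together have received at least g^2 citations. A minimal sketch of h and g over hypothetical citation counts for one journal (not the article's data):

```python
# Sketch: h-index and Egghe's g-index for one journal's article citation counts.

def h_index(citations):
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

def g_index(citations):
    """Largest g such that the top g articles have at least g*g citations in total."""
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, c in enumerate(ranked, start=1):
        total += c
        if total >= rank * rank:
            g = rank
    return g

# Hypothetical citation counts for one journal over a fixed window.
journal_cites = [120, 60, 30, 22, 15, 10, 8, 5, 3, 2, 1, 0]
print(h_index(journal_cites), g_index(journal_cites))  # -> 7 10
```

Because the g-index weighs the full citation total of the top articles, it rewards journals whose best papers are very highly cited, which is why it is reported alongside h.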
Conclusion
Most sociology journals examined here publish a considerable number of papers that achieve a substantial degree of scholarly visibility. The journal rankings presented here are based on the h index and draw from the Google Scholar database. The measures capture more citations than the traditional journal impact factor because of the longer time frame and because Google Scholar captures a broader range of citations, both from journals and from other sources. The PoP software is informative because it identifies specific, highly cited papers, and thus serves as a bibliographic tool and not just a journal ranking metric.
While the position of individual journals shifts somewhat with the new measure, by and large a steep hierarchy of journals remains. It is interesting, however, to note that the top cited paper in a journal is not unduly constrained by the journal’s rank: even modestly ranked journals often publish several highly visible papers. While certain aspects of journal rankings remain controversial, in my view the practice of journal rankings is likely to remain with us, and consequently improved and more comprehensive assessments are to be preferred to more limited ones.
Available at
 http://rd.springer.com/article/10.1007/s12108-015-9292-7

14 Oct 2016

Web of Science, Scopus, and Google Scholar citation rates: a case study of medical physics and biomedical engineering: what gets cited and what doesn’t?

Trapp, Jamie
  Web of Science, Scopus, and Google Scholar citation rates: a case study of medical physics and biomedical engineering: what gets cited and what doesn’t? 
Australasian Physical & Engineering Sciences in Medicine
DOI 10.1007/s13246-016-0478-2
Access to the Full Text

OBJECTIVES
To examine citation trends for the journal Australasian Physical & Engineering Sciences in Medicine in Web of Science, Scopus, and Google Scholar
METHODOLOGY
Sample
427 articles published in Australasian Physical & Engineering Sciences in Medicine
Design
A list of items published in Australasian Physical & Engineering Sciences in Medicine was generated from a search on Web of Science, and the citation count from each of the databases was recorded against each item. The sum totals of citations for all articles in the journal in the respective databases were then tabulated and compared.
Period analyzed: 
2007-2014
RESULTS
- The greatest number of citations clearly comes from Google Scholar, followed by Scopus and then Web of Science
- The proportion of articles with zero citations is 37% for Web of Science, 29% for Scopus, and 19% for Google Scholar
- The proportion of articles with 10 or more citations is 5% for Web of Science, 9% for Scopus, and 17% for Google Scholar


- The ratio of Google Scholar citations to Scopus citations is 1.3, and it is 1.5 when compared to Web of Science (Table 1)

CONCLUSIONS

Although there are exceptions, the general trend of the databases is that Google Scholar shows a greater number of citations, followed by Scopus and then Web of Science. 

What this study adds

This study supports the findings of previous studies: Google Scholar has a much broader document and citation coverage than Web of Science and Scopus. 

27 Jul 2016

Índice H de las Revistas Científicas Españolas según Google Scholar Metrics (2011-2015)

We are pleased to announce the publication of the Índice H de las revistas científicas españolas según Google Scholar Metrics (2011-2015), where you can find the impact of Spanish scientific journals based on the citation counts provided by Google Scholar Metrics.
This product was created to overcome an important limitation of Google Scholar Metrics: as of today, it does not allow journals to be grouped and sorted by country of publication. Google has chosen to offer its general rankings by language (showing the 100 highest-impact journals for each), and only for journals in English does it provide rankings by subject area and discipline (4,634 publications grouped into 8 subject categories and 262 unique subcategories); even then, it only presents the 20 journals with the highest h-index in each. Journals in the other nine languages for which Google presents lists (Chinese, Portuguese, German, Spanish, French, Japanese, Dutch and Italian) are excluded from this option. Google only directly provides information on the 67 highest-impact Spanish journals, which appear in the list of the 100 Spanish-language journals with the highest h-index.
The rankings are organized by scientific field and discipline for the Spanish journals covered by Google Scholar Metrics (GSM). A total of 1,069 journals have been identified, of which 560 belong to the Social Sciences, 248 to Arts and Humanities, 142 to Health Sciences, and 119 to Natural Sciences and Engineering.
The index can be accessed at this address:

This work is a direct continuation of those published for the periods

2010-2014
Ayllón, J.M; Martín-Martín, A.; Orduña-Malea, E.; Ruiz Pérez, R. ; Delgado López-Cózar, Emilio (2014). Índice H de las revistas científicas españolas según Google Scholar Metrics (2009- 2013). EC3 Reports, 17.  
2009-2013
Ayllón, J.M; Martín-Martín, A.; Orduña-Malea, E.; Ruiz Pérez, R. ; Delgado López-Cózar, Emilio (2014). Índice H de las revistas científicas españolas según Google Scholar Metrics (2009- 2013). EC3 Reports, 17.  


2008-2012
Ayllón Millán, J.M.; Ruiz-Pérez, R.; Delgado López-Cózar, E. Índice H de las revistas científicas españolas según Google Scholar Metrics (2008-2012). EC3 Reports, 7: 18 de noviembre de 2013. Available:


2007-2011
Delgado López-Cózar, E.; Ayllón, J.M.; Ruiz-Pérez, R. (2013). Índice H de las revistas científicas españolas según Google Scholar Metrics (2007-2011). 2ª edición. EC3 Informes, 3: 9 de abril de 2013. Available:


Comparing this edition with the previous ones, the following facts stand out:
- Great stability in the data
- A substantial increase in GSM's coverage of Spanish journals: 227 more journals in this edition than in the 2010-2014 edition (Figure 1).
- All scientific areas grow, especially Arts and Humanities (35.9%) and Health Sciences (24.6%).
On the other hand, if we compare GSM's coverage of Spanish journals with the traditional citation-based journal impact indexes (Figure 2), we find that GSM indexes more than twice as many journals as SJR (Scopus) and ten times as many as the Journal Citation Reports (Web of Science).
We want to stress that the ultimate goal of this work is to verify the breadth of Google Scholar Metrics' coverage of Spanish scientific journals, in line with the experimental line of research opened by the EC3 research group, aimed at testing the possibilities of Google Scholar and its derived products for evaluative purposes.
Granada, July 27, 2016

18 Jul 2016

2016 Google Scholar Metrics released: a matter of languages... and something else

WHAT IS NEW IN GOOGLE SCHOLAR METRICS 2016?

We can only be delighted by the publication of the new edition of Google Scholar Metrics (GSM) (Thursday, July 14, 2016, 6:44 PM). This year's fifteen-day delay with respect to the release of the previous version in 2015 (Thursday, June 25, 2015, 12:16 PM) was starting to worry us, but we now see that these worries were unfounded. This year, GSM has been the last product for journal evaluation through citation analysis to be updated: the new editions of the Journal Citation Reports, Journal Metrics, and the SCImago Journal Rank were released in June.

As we said last year, we can only welcome that the American company has decided to keep supporting GSM, a free product which is also very different from traditional journal rankings. Competition is healthy, and scientists can only be pleased about this variety of search and ranking tools, especially when they are offered free of charge.

There haven't been any structural changes in this new edition. The total number of publications that can be visualised in the 2016 rankings is 7,398. However, since 1,664 of them (22.5%) are classified in more than one subject area, the number of unique publications is lower: 5,734.

The main differences with respect to last year's version are the inclusion of five additional language rankings (Russian, Korean, Polish, Ukrainian and Indonesian) and the removal of two language rankings: Italian and Dutch. The addition of new language rankings is welcome, as they enrich the product. That's why we don't understand why they decided to remove the Italian and Dutch rankings.



Another important change in this new version of Google Scholar Metrics is the removal of many Working Papers and Discussion Papers series. If users search for "working papers", "discussion papers", "working paper", or "discussion paper" in GSM's search box, they will only get 7 results in total.

For example, in the previous edition of Scholar Metrics, the CEPR Discussion Papers (h5-index: 112) was ranked #4 in the general category Business, Economics & Management. This series even made it to the top 100 publications in the English ranking (93rd position). Similarly, the IZA Discussion Papers (h5-index: 82) was ranked #8 in the general category Business, Economics & Management. These two series are not to be found in the new edition of Scholar Metrics.


One might think this has been caused by a change in GSM's inclusion policies [1]. They may have decided to remove all working papers and discussion papers, but if that were the case, they shouldn't have included other working paper series, like NBER Working Papers, currently #1 in the general category Business, Economics & Management, and also #1 in the subcategory Economics. They have also maintained all the subcategories available at arXiv. This is clearly inconsistent.



  
Apart from these differences, Google has just updated the data, which means that some of the limitations outlined in previous studies still persist [2-7]: the visualisation of a limited number of publications (100 for those that are not published in English), the lack of categorisation by subject areas and disciplines for non-English publications, and normalisation problems (unification of journal titles, problems in the linking of documents, and problems in the search and retrieval of publication titles). 
There are three different entries for the Brazilian Journal of Anesthesiology.

One of the main sources of errors in GSM is journals published in several languages. Journals published in their original native language and, at the same time, in English are quite common. GSM has decided to create separate entries for each of the languages in which a journal is published.



This decision is arguable, but at the very least, it should be applied consistently to all journals. The journal Revista Española de Cardiología, however, received a different treatment: the Spanish and English versions were merged.



In the case of Revista Española de Enfermedades Digestivas and Revista Portuguesa de Neumología, they weren't able to successfully separate the two versions, since both versions present articles in the original languages (Spanish and Portuguese, respectively), and English.

In the case of  Giornale italiano di medicina del lavoro ed ergonomia, they only identified the English version, but not the Italian one.

There are also several errors related to the correct linking of documents, which point to references or to incorrect full texts.

In some cases, like the journal Nutrición Hospitalaria, we find dead links, links to the PDFs in SciELO, links to Dialnet, and links to the various repositories where authors have archived their articles. Probably for this reason, the title of the journal presents up to three variants.
  
Over the years we have detected cases of journals that don't seem to meet all the criteria set by GSM for inclusion in this product (mainly the minimum of 100 articles published in the last five years), and nevertheless they are included. An example of this phenomenon is the journal Area Abierta, for which there are only 43 articles published between 2011 and 2015 indexed in Google Scholar, but which is still included. Additionally, the most cited article listed for this journal is incorrect, because it actually points to an article published in another journal.
 
The journal Investigaciones de Historia Económica presents a similar case: this journal doesn't publish the minimum of 100 original articles in the last five years, and still it is included in GSM. If we search for articles published by this journal in Google Scholar, we see that it publishes a large number of book reviews. Probably these reviews were counted as articles when the data were computed.



Lastly, it should be remembered that journals do not always present a uniform typographic design in their titles or in the titles of their articles.

Having said that, there are fewer errors than in previous years.

In our previous studies, we have described again and again the underlying philosophy embedded in all of Google’s academic products. These products have been created in the image and likeness of Google’s general search engine: fast, simple, easy to use, understand, and calculate, and, last but not least, accessible to everyone free of charge. GSM follows all these precepts, and it is, in the end, nothing more than:

- A hybrid between a bibliometric tool (indicators based on citation counts), and a bibliography (a list of highly cited documents, and of the documents that cite them).
- It offers a simple, straightforward journal classification scheme (although it also includes some conferences and repositories).
- It is based on two basic bibliometric indicators (the h index, and the median number of citations for the articles that make up the h index); see the sketch after this list.
- It covers a single five-year time frame (the current one being 2011-2015).
- It uses rudimentary journal inclusion criteria, namely: publishing at least 100 articles during the last five-year period, and having received at least one citation.
- It provides lists of publications according to the language their documents are written in. For all of them except English publications (a total of 11 languages: Chinese, Portuguese, German, Spanish, French, Japanese, Russian, Korean, Polish, Ukrainian and Indonesian), it offers lists of only 100 titles: those with the highest h-index. For English publications, however, it shows a total of 4,737 different publications, grouped in 8 subject areas. For each publication, it shows the titles of the documents whose citations contribute to the h index, and for each one of these documents, in turn, the titles of the documents that cite them.
- It provides a search feature that, for any given set of keywords, will retrieve a list of 20 publications whose titles contain the selected keywords. In cases where more than 20 publications satisfy the query, only the first 20 results, those with the highest h-index, will be displayed.
- It doesn’t perform any kind of quality control in the indexing process nor in the information visualization process.

To sum up, GSM is a minimalist information product with few features, closed (it cannot be customized by the user), and simple (navigating it only takes a few clicks). If GSM wants to improve as a bibliometric tool, it should incorporate a wider range of features. At the very least, it should:

- Display the total number of publications indexed in GSM, as well as their countries and language of publication. Our estimations lead us to believe that this figure is probably higher than 40,000 [8]. In the case of Spain, there are over 1,000 publications indexed, which make up about 45% of the total number of academic publications in Spain [9-11].
- Provide some other basic and descriptive bibliometric indicators, like the total number of documents published in the publications indexed in GSM, and the total number of citations received in the analysed time frame. These are the two essential parameters that make it possible to assess the reliability and accuracy of any bibliometric indicator. Other indicators could be added in order to elucidate other issues like self-citation rates, impact over time (immediacy index), or to normalize results (citation average).
- Provide the complete list of documents of any given publication that have received n citations, and especially those that have received 0 citations. This would allow us to verify the accuracy of the information provided by this product. It is true, much to Google’s credit, that this information could be extracted, though not easily, from Google Scholar.
- Provide a detailed list of the conferences and repositories included in the product. The statement Google makes about including some conferences in the Engineering & Computer Science area, and some document collections like the mega-repositories arXiv, RePEc and SSRN, is much too vague.
- Define the criteria that have been followed for the creation of the classification scheme (areas and disciplines), and the rules and procedures followed when assigning publications to these areas and disciplines.
- Enable the selection of different time frames for the calculation of indicators and the visualization and sorting of publications. The significant disparities in publishing processes and citation habits between areas (publishing speed, pace of obsolescence) require the possibility to customize the time frame according to the particularities of any given subject area.
-  Enable access to previous versions of Google Scholar Metrics (2007-2011, 2008-2012, 2009-2013, 2010-2014) to ensure that it is possible to assess the evolution of publications over time. Moreover, they could dare venture into the unknown and do something no one else has done before: a dynamic product, with indicators and rankings updated in real-time, just as Google Scholar does.
- Enable browsing publications by language, country and discipline, and directly display all results for these selections.
- Remove visualization restrictions: currently 100 results for each language and 20 for each discipline or keyword search.
- Enable the visualization of results by country of publication and by publisher.
- Enable sorting results according to various criteria (publication title, country, language, publishers), as well as according to other indicators (h index, h median, number of documents per publication, number of citations, self-citation rate…).
- Enable searching not only by publication title, but also by country and language of publication.
- Enable an option for exporting global results, as well as results by discipline, or those of a custom query.
- Enable an option for reporting errors detected by users, so they can be fixed (duplicate titles, erroneous titles, incorrect links, deficient calculations…).
- Lastly, reducing the minimum number of articles published in the last 5 years from 100 to 50 might be a good idea. 20 articles per year is not a difficult goal for journals written in English, especially in areas like the natural sciences and health. However, there are many local journals published in non-English-speaking countries, especially in the Arts & Humanities, that just can't reach that number of articles.
As we already said two years ago.


Bibliography

1. Delgado López-Cózar, Emilio & Robinson-García, Nicolás (2012). Repositories in Google Scholar Metrics or what is this document type doing in a place as such? Cybermetrics, 16(1), paper 4. http://digibug.ugr.es/bitstream/10481/22019/1/repositorios_cybermetrics.pdf
2. Delgado López-Cózar, E; Cabezas-Clavijo, Á (2012). Google Scholar Metrics updated: Now it begins to get serious. EC3 Working Papers 8: 16 de noviembre de 2012. Available: http://digibug.ugr.es/bitstream/10481/22439/6/Google%20Scholar%20Metrics%20updated.pdf
3. Delgado-López-Cózar, E., y Cabezas-Clavijo, Á. (2012). Google Scholar Metrics: an unreliable tool for assessing scientific journals. El Profesional de la Información, 21(4), 419–427. Available: http://dx.doi.org/10.3145/epi.2012.jul.15
4. Cabezas-Clavijo, Á., y Delgado-López-Cózar, E. (2012). Scholar Metrics: el impacto de las revistas según Google, ¿un divertimento o un producto científico aceptable? EC3 Working Papers, (1). Available: http://eprints.rclis.org/16830/1/Google%20Scholar%20Metrics.pdf
5. Cabezas-Clavijo, Álvaro; Delgado López-Cózar, Emilio (2013). Google Scholar Metrics 2013: nothing new under the sun. EC3 Working Papers, 12: 25 de julio de 2013. Available: http://arxiv.org/ftp/arxiv/papers/1307/1307.6941.pdf
6. Martín-Martín, A.; Ayllón, J.M.; Orduña-Malea, E.; Delgado López-Cózar, E. (2014). Google Scholar Metrics 2014: a low cost bibliometric tool. EC3 Working Papers, 17: 8 July 2014. http://arxiv.org/pdf/1407.2827
7. Martín-Martín, A.; Ayllón, J.M.; Orduña-Malea, E.; Delgado López-Cózar, E. (2015). 2015 Google Scholar Metrics: happy monotony. EC3 Google Scholar Digest, 26 Jun 2015. http://googlescholardigest.blogspot.com.es/2015/06/google-scholar-metrics-2015-happy.html
8. Delgado López-Cózar, E.; Cabezas Clavijo, A. (2013). Ranking journals: could Google Scholar Metrics be an alternative to Journal Citation Reports and Scimago Journal Rank? Learned Publishing, 26 (2): 101-113. Available: http://arxiv.org/ftp/arxiv/papers/1303/1303.5870.pdf  
9. Delgado López-Cózar, E.; Ayllón, JM, Ruiz-Pérez, R. (2013). Índice H de las revistas científicas españolas según Google Scholar Metrics (2007-2011). 2ª edición. EC3 Informes, 3: 9 de abril de 2013. Available: http://digibug.ugr.es/handle/10481/24141  
10. Ayllón Millán, J.M.; Ruiz-Pérez, R.; Delgado López-Cózar, E. Índice H de las revistas científicas españolas según Google Scholar Metrics (2008-2012). EC3 Reports, 7 (2013). Available: http://hdl.handle.net/10481/29348
11. Ayllón, Juan Manuel; Martín-Martín, Alberto; Orduña-Malea, Enrique; Ruiz Pérez, Rafael ; Delgado López-Cózar, Emilio (2014). Índice H de las revistas científicas españolas según Google Scholar Metrics (2009-2013). EC3 Reports, 17. Granada, 28 de julio de 2014. Available: http://hdl.handle.net/10481/32471
Granada, July 15, 2016, 22:10.