Zhang, L., Rousseau, R., & Glänzel, W. (2016). Diversity of references as an indicator of the interdisciplinarity of journals: Taking similarity between subject fields into account. Journal of the Association for Information Science and Technology, 67(5), 1257-1265.
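This study measures the interdisciplinarity of journals through the diversity of the subject fields assigned to the items cited in each article's reference list. Subject fields and subfields follow the Leuven-Budapest (ECOOM) subject-classification scheme, and diversity is measured with a Hill-type true diversity that takes variety, balance, and disparity into account. After measuring the interdisciplinarity of journals from several disciplines, the study also examines whether highly interdisciplinary journals have greater visibility and impact.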
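In research on diversity, the evenness of the distribution and the number of distinct categories are usually regarded as the two main dimensions. Stirling (2007) proposed adding differences in network structure to make the concept of diversity more precise, which led to the Rao-Stirling diversity measure. However, in the studies by Leydesdorff and Rafols (2011) and by Zhou, Rousseau, Yang, Yue, and Yang (2012), the Rao-Stirling measure did not perform better than the previously used Shannon entropy or Gini coefficient.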
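According to the definition of interdisciplinarity given by the National Academies of the USA, interdisciplinary research is a mode of research by teams or individuals that integrates information, data, techniques, tools, perspectives, concepts, and/or theories from two or more disciplines or bodies of specialized knowledge in order to advance fundamental understanding or to solve problems that go beyond a single discipline. Based on this definition, Rafols and Meyer (2010) provided a framework for studying interdisciplinarity: they combined the idea of diversity with network coherence and produced corresponding visualizations.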
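Jost (2009) states that a true diversity must satisfy six requirements:
1. Symmetry.
2. Zero output independence: adding a category whose value is 0 does not change the diversity.
3. Transfer principle: transferring abundance from a rarer category to a more common one decreases the diversity.
4. Homogeneity: diversity depends only on the relative frequencies of the categories, not on their absolute abundances.
5. Replication principle: suppose m groups have the same set of species abundances but share no species with one another, so that all m groups have the same diversity D0. When the m groups are pooled, the diversity of the whole is m·D0.
6. Normalization: if the diversity measure is applied to N equally common categories, its value is N.
Only when these six requirements are met are ratios of diversity values meaningful.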
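The six requirements can be checked numerically on a small example. The sketch below is an illustration only, not code from the paper: `hill_diversity` is a hypothetical helper name, the abundance vectors are made up, and order q = 2 is chosen arbitrarily.

```python
import math

def hill_diversity(abundances, q=2.0):
    """Hill number (true diversity) of order q for non-negative abundances."""
    total = sum(abundances)
    # Categories with zero abundance are dropped; only relative frequencies matter.
    p = [a / total for a in abundances if a > 0]
    if abs(q - 1.0) < 1e-12:
        # Order 1 is the limit q -> 1: the exponential of the Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

base = [5, 3, 2]          # made-up abundances
d0 = hill_diversity(base)

assert math.isclose(hill_diversity([2, 5, 3]), d0)        # 1. symmetry
assert math.isclose(hill_diversity(base + [0]), d0)       # 2. zero output independence
assert hill_diversity([6, 3, 1]) < d0                     # 3. transfer from rarer to more common
assert math.isclose(hill_diversity([50, 30, 20]), d0)     # 4. homogeneity
assert math.isclose(hill_diversity(base * 3), 3 * d0)     # 5. replication with m = 3
assert math.isclose(hill_diversity([1, 1, 1, 1]), 4.0)    # 6. normalization
```

Because of the replication and normalization properties, a statement such as "twice as diverse" has a direct interpretation, which is the point of the closing remark about ratios.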
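This study adopts the measure proposed by Leinster & Cobbold (2012). The measure satisfies the following requirements:
1. The diversity of N equally abundant, completely dissimilar species is N.
2. Suppose a community is divided into m subgroups, with no species shared between subgroups and with species in different subgroups completely dissimilar. Then the diversity of the community depends only on the sizes and the diversities of the subgroups.
3. Moreover, if these m subgroups are of equal size and have the same diversity D0, the diversity of the whole community is m·D0.
4. Diversity does not change with the order in which the species are listed.
5. Adding a new species with abundance 0 leaves the diversity unchanged.
6. If two species are identical, merging them leaves the diversity unchanged.
7. When the similarity between species increases, diversity decreases.
8. When the similarity between species is ignored, the resulting diversity is larger.
9. The diversity of N species lies between 1 and N.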
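Several of these requirements can be verified on toy data. The sketch below assumes the usual Leinster-Cobbold form, in which the diversity of order q is computed from relative abundances p and a similarity matrix Z. The function name `similarity_diversity`, the similarity values, and the abundance vectors are hypothetical and serve only as an illustration.

```python
import math

def similarity_diversity(p, Z, q=2.0):
    """Similarity-sensitive diversity of order q for proportions p and similarities Z."""
    n = len(p)
    # (Zp)_i measures how "ordinary" category i is, given its similar neighbours.
    zp = [sum(Z[i][j] * p[j] for j in range(n)) for i in range(n)]
    support = [i for i in range(n) if p[i] > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(p[i] * math.log(zp[i]) for i in support))
    return sum(p[i] * zp[i] ** (q - 1.0) for i in support) ** (1.0 / (1.0 - q))

def two_species(s):
    """Similarity matrix for two species with mutual similarity s (assumed value)."""
    return [[1.0, s], [s, 1.0]]

identity4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
d_pair = similarity_diversity([0.5, 0.5], two_species(0.2))

# 1. N equally abundant, completely dissimilar species have diversity N.
assert math.isclose(similarity_diversity([0.25] * 4, identity4), 4.0)
# 5. A species with abundance 0 changes nothing.
z_with_absent = [[1.0, 0.2, 0.7], [0.2, 1.0, 0.7], [0.7, 0.7, 1.0]]
assert math.isclose(similarity_diversity([0.5, 0.5, 0.0], z_with_absent), d_pair)
# 6. Splitting one species into two identical copies changes nothing.
z_split = [[1.0, 0.2, 0.2], [0.2, 1.0, 1.0], [0.2, 1.0, 1.0]]
assert math.isclose(similarity_diversity([0.5, 0.25, 0.25], z_split), d_pair)
# 7. Higher similarity gives lower diversity.
assert similarity_diversity([0.5, 0.5], two_species(0.6)) < d_pair
# 8. and 9. Ignoring similarity gives the larger value, and 1 <= D <= N.
assert 1.0 <= d_pair <= similarity_diversity([0.5, 0.5], two_species(0.0)) <= 2.0
```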
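Using the 16 major subject fields and 68 subfields of ECOOM as the subject system for assessing diversity, the study measures the interdisciplinarity of articles from seven journals. The results obtained with major fields and with subfields as the assessment categories differ considerably. "Journal of the American Geriatrics Society" ranks second when measured with ECOOM subfields but fifth with major fields. By contrast, "Scientometrics" rises markedly in rank when the subject system changes from subfields to major fields (from fourth to first). This indicates that at the local level, based on subfields, Journal of the American Geriatrics Society is the more diverse, whereas the references of Scientometrics are spread more widely across major fields at the global level. "Bioinformatics" shows relatively high diversity at both the local and the global level. The two mathematics journals, as expected, score low on diversity at both levels. Of Nature and Science, the two journals generally regarded as multidisciplinary, the former scores higher at both levels.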
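Larivière and Gingras (2010) used, for papers published in 2000, the share of references to other WoS categories (categories other than that of the paper's own journal) as an indicator of interdisciplinarity. Overall, there was no correlation between the interdisciplinarity of papers and the number of citations they received, although this varied by discipline. For most disciplines there is an optimal level of interdisciplinarity: the most cited articles are neither the least nor the most interdisciplinary. The present study carries out this analysis with the diversity measure proposed by Leinster & Cobbold (2012) and compares it with the number of citations received within three years of publication. The results show that the relationship between citations and diversity differs greatly from journal to journal. In the two multidisciplinary journals, Nature and Science, the average number of citations increases with diversity and reaches its highest value at 4-6. Journal of the American Geriatrics Society and Scientometrics show similar citation behavior. Bioinformatics differs from these journals: its average number of citations tends to decline as diversity increases. The two mathematics journals tend to receive more citations as diversity increases. For abstract-theoretical disciplines, an increase in diversity seems to imply broader applicability, which in turn leads to more citations.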
The objective of this article is to further the study of journal interdisciplinarity, or, more generally, knowledge integration at the level of individual articles. Interdisciplinarity is operationalized by the diversity of subject fields assigned to cited items in the article’s reference list. Subject fields and subfields were obtained from the Leuven-Budapest (ECOOM) subject-classification scheme, while disciplinary diversity was measured taking variety, balance, and disparity into account. As diversity measure we use a Hill-type true diversity in the sense of Jost and Leinster-Cobbold.
Zhang, Glänzel, and Liang (2009, 2010) applied the entropy
indicator to measure how far cross-citation links are spread
among other journals, and compared the result with “centrality”
measures. The authors found a clear divergence
between strongly interlinked and high-entropy journals.
Rafols and Meyer (2010) provided a framework for the
study of interdisciplinarity, where interdisciplinarity is
understood as knowledge integration.
Most important, following Jost (2006, 2007, 2009) and Leinster and Cobbold (2012), we oppose the use of measures such as the Shannon entropy and the Rao-Stirling measure, and use their Hill-type numbers.
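For reference, the relation between these two measures and their Hill-type counterparts can be written out as follows. This is a sketch under stated assumptions rather than a quotation from the paper: the symbols p_i (proportion of category i), d_ij (distance between categories, scaled to [0, 1] with d_ii = 0), and s_ij = 1 - d_ij (similarity) are notation chosen here, and the Rao-Stirling sum is taken over ordered pairs. Some authors sum over unordered pairs instead, which changes the value by a factor of 2.

```latex
% Shannon entropy and its Hill-type counterpart (order 1)
H = -\sum_{i=1}^{N} p_i \ln p_i, \qquad {}^{1}D = \exp(H)

% Rao-Stirling measure over ordered pairs i \neq j
\Delta = \sum_{i \neq j} d_{ij}\, p_i\, p_j

% Hill-type counterpart (order 2), using s_{ij} = 1 - d_{ij} and \sum_{i,j} p_i p_j = 1
{}^{2}D^{S} = \frac{1}{\sum_{i,j} s_{ij}\, p_i\, p_j} = \frac{1}{1 - \Delta}
```

Unlike H and Δ themselves, the transformed values behave like an "effective number" of categories, ranging from 1 up to N.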
In order to be able to do so, we interpret interdisciplinarity as a kind of diversity and will first shed light on the mathematical background of measuring several aspects that are usually associated with diversity, namely, variety, balance, and disparity.
Stirling (2007) and Leinster and Cobbold (2012) pointed
out that the notion of diversity has three components:
variety, balance, and disparity. Each of them, considered
separately, is necessary but not yet sufficient to measure
diversity in an adequate manner. Neglecting one of these
three aspects may distort the final assessment of diversity.
Variety is the number of nonempty categories to which
system elements are assigned. In particular, it is the answer
to the question: How many types of things do we have? In
information science it may be the answer to the question: In
how many different journals has this author published?
Assuming that all things are equal, the greater the variety,
the greater the diversity.
Balance is a function of the pattern of the assignment of
elements across categories. It is the answer to the question:
What is the relative number of items of each type? Balance
is also called evenness (in ecology) and concentration (in
economics). Evenness can be represented by the Lorenz
curve (Nijssen et al., 1998). The Gini index is a well-known
concentration or evenness measure (actually if G denotes the
Gini concentration measure, then g = 1-G is the corresponding
measure of evenness). In information science one may
consider, for instance, how many articles an author has
published in each journal. All else being equal, the more
balanced the distribution, the larger the diversity.
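To make the relation g = 1-G concrete, here is a minimal sketch that computes a Gini concentration value from the number of articles an author has published in each journal. It assumes the mean-absolute-difference form of the Gini index. The function names and the counts are hypothetical, and the excerpt does not say which normalization of G the authors have in mind.

```python
def gini_concentration(counts):
    """Gini index G: summed absolute differences divided by 2 * n^2 * mean."""
    n = len(counts)
    total = sum(counts)
    abs_diff = sum(abs(x - y) for x in counts for y in counts)
    # 2 * n^2 * mean simplifies to 2 * n * total.
    return abs_diff / (2 * n * total)

def gini_evenness(counts):
    """Evenness g = 1 - G; equals 1 for a perfectly balanced distribution."""
    return 1.0 - gini_concentration(counts)

balanced = [3, 3, 3]     # made-up articles per journal
skewed = [10, 1, 1]

assert gini_evenness(balanced) == 1.0
assert gini_evenness(skewed) < gini_evenness(balanced)
```

With this unnormalized form, G cannot exceed (n-1)/n for n categories, so g stays above 0 for any finite n.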
Disparity refers to the manner and the degree in which
things may be distinguished. It is the answer to the question:
How different from each other are the types of things that we
observe? For instance, publishing only in library and information
science (LIS) journals shows less disparity than publishing
in LIS and management and economics journals. All
else being equal, the higher the disparity, the greater the
diversity.
Mathematically speaking, variety is a positive, natural
number as categories are numbered in sequence; balance is
a function of fractions summing up to one, and disparity is a
function of a matrix of distances (or similarities).
The problem now is how to find a single index that can
aggregate properties of variety, balance, and disparity in a
meaningful way and without much loss of information.
The interdisciplinarity of a publication is operationalized
by the diversity of subject classifications over the publication’s
references. Measures of diversity are calculated for
each publication by classifying its references into one or
more disciplines.
• Variety corresponds to the number of subject fields to which
references of an individual paper can be assigned. A publication
will have a high variety if its references are assigned to
many different subject fields.
• Balance describes the evenness of the distribution of the
subject field classifications. A publication will have high
balance if the proportion of references is evenly distributed
across categories (e.g., three for cell biology, three for physiology,
and three for microbiology) and low balance if they are
unevenly distributed (10 for cell biology, one for physiology,
and one for microbiology).
• Disparity is taken into account by the distance between
subject fields the references have been assigned to.
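The three components can be read off a reference list in a few lines. The sketch below is an illustration under assumptions, not the authors' implementation: `reference_diversity` and `toy_similarity` are hypothetical names, the similarity value 0.5 between different fields is invented, and the order-2 form 1 / Σ s_ij p_i p_j is used as one possible Hill-type diversity. The two reference lists reproduce the balanced and unbalanced examples from the bullets above.

```python
from collections import Counter

def toy_similarity(field_a, field_b):
    """Invented similarity: 1 for the same field, 0.5 for any two different fields."""
    return 1.0 if field_a == field_b else 0.5

def reference_diversity(fields, similarity):
    """Return (variety, proportions, order-2 diversity) for a list of field labels."""
    counts = Counter(fields)
    total = sum(counts.values())
    proportions = {f: c / total for f, c in counts.items()}   # balance
    variety = len(counts)                                     # variety
    # Disparity enters through the similarity between every pair of fields.
    expected_similarity = sum(
        similarity(a, b) * proportions[a] * proportions[b]
        for a in proportions for b in proportions
    )
    return variety, proportions, 1.0 / expected_similarity

balanced_refs = ["cell biology"] * 3 + ["physiology"] * 3 + ["microbiology"] * 3
skewed_refs = ["cell biology"] * 10 + ["physiology"] + ["microbiology"]

v_bal, p_bal, d_bal = reference_diversity(balanced_refs, toy_similarity)
v_skew, p_skew, d_skew = reference_diversity(skewed_refs, toy_similarity)

# Same variety, but the balanced list is the more diverse of the two.
assert v_bal == v_skew == 3
assert d_bal > d_skew
```

Replacing `toy_similarity` with values derived from an actual field-to-field similarity matrix is the step that would tie this sketch to a real classification scheme.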
It seems that for an abstract-theoretical field,
such as mathematics and logic, with generally low diversity,
an increase in diversity points to a broader applicability
and hence an increase in citations.