Rafols, I. & Meyer, M. (2010). Diversity and network coherence as indicators of interdisciplinarity: case studies in bionanoscience. Scientometrics, 82, 263-287.
Scientometrics
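This study proposes a framework for analysing interdisciplinarity from the perspective of knowledge integration. The framework rests on two concepts: diversity, meaning that the literature (knowledge) a publication draws on comes from several disciplines, and network coherence, meaning the degree to which those documents are linked by similarity into a network.
The disciplinary diversity of a publication is expressed as the distribution of its references (or, when there are too few of them, the references of its references) over the Subject Categories to which their journals are assigned in the ISI database. Three aspects of this distribution are considered: 1) variety: a publication whose references fall into more Subject Categories is more likely to be interdisciplinary; 2) balance: the more evenly the references are spread over the categories, the higher the disciplinary diversity; 3) disparity or similarity: the more the categories differ from one another, the higher the disciplinary diversity is likely to be.
Building on this breakdown, the study estimates diversity with the following indicators: 1) the number of categories, N; 2) the entropy computed from the proportions $p_i$ in each category, $H = -\sum_i p_i \log p_i$; 3) the Simpson diversity, $I = \sum_{i \neq j} p_i p_j$; 4) the Stirling index, $\Delta = \sum_{i \neq j} p_i p_j d_{ij}$, where $d_{ij}$ is the disparity between categories i and j, and the similarity $s_{ij} = 1 - d_{ij}$ is the cosine between the two categories' patterns of citing other categories. The relations among the Subject Categories are also mapped as a network in which node size reflects how many of the publication's references fall in each category, so that disciplinary diversity can be read off visually.
Network coherence is estimated through a structural analysis of the network formed by bibliographic coupling among a publication's references, using 1) mean linkage strength, S, and 2) mean path length, L. Mean linkage strength is the average link strength over all pairs of nodes in the network, that is, the average bibliographic coupling over all pairs of references. Mean path length is the average shortest-path length over all pairs of nodes.
The results show that the diversity indicators N, H, I and Delta are only weakly correlated with the coherence indicators S and L, which indicates that the two concepts capture different aspects of interdisciplinarity. In fact, N shows no clear relation even to the other three diversity indicators, and its results are harder to interpret. H, I and Delta are highly correlated with one another, especially H and Delta, so any one of the three can be used to measure disciplinary diversity; publications with high diversity are also easy to spot in the visualisations. Finally, S and 1/L are likewise highly correlated, and either can be used to measure coherence.
In sum, the study recommends the Stirling index Delta and mean linkage strength for measuring the interdisciplinarity of a publication. Depending on whether each of the two indicators is high or low, a publication falls into one of four cases: 1) low diversity, high coherence: specialised disciplinary research, with most references belonging to the same discipline; 2) low diversity, low coherence: references belong to the same discipline but to several different research specialties; 3) high diversity, low coherence: references come from several disciplines that have not yet been fully integrated; 4) high diversity, high coherence: references come from several disciplines but have already been fully integrated.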
We propose a conceptual framework that aims to capture interdisciplinarity in the wider sense of knowledge integration, by exploring the concepts of diversity and coherence.
Disciplinary diversity indicators are developed to describe the heterogeneity of a bibliometric set viewed from predefined categories, i.e. using a top-down approach that locates the set on the global map of science.
Network coherence indicators are constructed to measure the intensity of similarity relations within a bibliometric set, i.e. using a bottom-up approach, which reveals the structural consistency of the publications network.
We carry out case studies on individual articles in bionanoscience to illustrate how these two perspectives identify different aspects of interdisciplinarity: disciplinary diversity indicates the large-scale breadth of the knowledge base of a publication; network coherence reflects the novelty of its knowledge integration.
Review of bibliometric studies on interdisciplinarity
Most investigations use a top-down approach and predefined categories (typically ISI Subject Categories, SCs) to study their proportions and/or relations. For example, van Raan and van Leeuwen (2002) describe interdisciplinarity in an institute in terms of the percentage of publications and citations received to and from each SC.
Some investigations adopt a bottom-up approach, in which the low-level elements investigated (e.g. publications, papers) are clustered or classified into factors on the basis of multivariate analyses of similarity measures (Small 1973; Braam et al. 1991; van den Besselaar and Leydesdorff 1994; Schmidt et al. 2006). These clusters are then projected in 2D or 3D maps to provide an insight into the structure of the field and estimate the degree of network-level similarity. Similarity measures have also been used to compute network properties, such as centralities, to identify interdisciplinarity (Otte and Rousseau 2002; Leydesdorff 2007).
In this framework, we view knowledge integration as a dynamical process that is characterised by high cognitive heterogeneity (diversity) and increases in relational structure (coherence); in other words, as a process in which previously different and disconnected bodies of research become related.
Diversity: concept and measures
The concept of diversity is used in many scientific fields, from ecology to economics and cultural studies, to refer to three different attributes of a system comprising different categories (Stirling 1998, 2007; Purvis and Hector 2000):
• variety: number of distinctive categories;
• balance: evenness of the distribution of categories;
• disparity or similarity: degree to which the categories are different/similar.
Our interest in using Stirling’s framework to track interdisciplinarity is twofold.
First, since Stirling’s generalised formulation needs a metric (dij) and has open values for the parameters a and b, it highlights that the mathematical form of any diversity index includes some prejudgement of the aspect of diversity that is considered important. High values for b give more weight to the contribution of large categories, and high values for a see the cooccurrence of distant categories as more important. The choice of the metric used to define distance is inevitably value laden.
Second, and very importantly for emerging fields, the inclusion of distance among categories lessens the effect of inappropriate categorisation changes: if a new category i is very similar to an existing category j, their distance dij will be close to zero, and its inclusion in the list of categories will result in only slightly increased diversity.
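The two points above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the generalised Stirling form sum over i != j of dij^a * (pi*pj)^b, and the proportions and distances are made-up illustrative values, not data from the paper.

```python
import math

def variety(p):
    """Variety N: number of categories with a non-zero share."""
    return sum(1 for x in p if x > 0)

def shannon(p):
    """Shannon entropy H = -sum_i p_i * ln(p_i)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def simpson(p):
    """Simpson diversity I = sum_{i != j} p_i p_j = 1 - sum_i p_i^2."""
    return 1.0 - sum(x * x for x in p)

def stirling(p, d, a=1.0, b=1.0):
    """Assumed generalised Stirling index: sum_{i != j} d_ij^a * (p_i p_j)^b."""
    n = len(p)
    return sum(
        (d[i][j] ** a) * ((p[i] * p[j]) ** b)
        for i in range(n)
        for j in range(n)
        if i != j and p[i] > 0 and p[j] > 0
    )

# Two equally sized, maximally distant categories (illustrative values).
p_before = [0.5, 0.5]
d_before = [[0.0, 1.0],
            [1.0, 0.0]]

# The second category is split into two near-identical ones (distance 0.05).
p_after = [0.5, 0.25, 0.25]
d_after = [[0.0, 1.0, 1.0],
           [1.0, 0.0, 0.05],
           [1.0, 0.05, 0.0]]

print(simpson(p_before), simpson(p_after))                       # Simpson rises noticeably
print(stirling(p_before, d_before), stirling(p_after, d_after))  # Stirling barely moves
```

With a = 0 and b = 1 the distance term drops out and the function reduces to Simpson diversity, which is one way to see that choosing the parameters already encodes a judgement about which aspect of diversity matters.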
Coherence: concept and measures
In our bibliometric context, coherence expresses the extent to which publication networks form a more or less compact structure. If we take degree of cognitive similarity as the linkage between publications (e.g. by using co-citation, co-word or bibliographic coupling), a more clustered network is seen as having higher cognitive coherence.
However, since the key aspect of interdisciplinary research has been argued to be the dynamical process of knowledge integration (section “Definition of interdisciplinarity”), interdisciplinarity should ideally be assessed in terms of a temporal derivative, i.e. a change in coherence.
High coherence within the reference set in a publication means that its referencing practices are highly specialised and hence, that it builds on an already established research specialty.
(i) Low diversity, high coherence: specialised disciplinary research, in which all the references are from the same discipline and are related.
(ii) Low diversity, low coherence: a publication relating distant research specialties within one discipline.
(iii) High diversity, low coherence: a publication citing references that were hitherto unrelated and belong to different disciplines, a potential instance of interdisciplinary knowledge integration.
(iv) High diversity, high coherence: a publication citing across several disciplines, to references that are similar. This similarity suggests that the references belong to a single research specialty. Hence, although the publication is interdisciplinary, it does not involve new knowledge integration.
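The four cases amount to a two-by-two classification. The sketch below is only an illustration: the function name and both cut-off arguments are hypothetical, and the source gives no numerical thresholds, so the caller has to supply them.

```python
def classify_publication(diversity, coherence, diversity_cutoff, coherence_cutoff):
    """Place a publication in one of the four diversity/coherence cases.

    Both cut-offs are hypothetical parameters chosen by the analyst;
    the source text does not specify any threshold values.
    """
    high_diversity = diversity >= diversity_cutoff
    high_coherence = coherence >= coherence_cutoff
    if not high_diversity and high_coherence:
        return "(i) specialised disciplinary research"
    if not high_diversity and not high_coherence:
        return "(ii) distant specialties within one discipline"
    if high_diversity and not high_coherence:
        return "(iii) potential interdisciplinary knowledge integration"
    return "(iv) interdisciplinary, but an already integrated specialty"

# Illustrative values only.
print(classify_publication(0.7, 0.1, diversity_cutoff=0.5, coherence_cutoff=0.3))
```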
Operationalisation of disciplinary diversity
The disciplinary diversity of an article was constructed from the distribution of ISI SCs in the references of references (ref-of-refs in Fig. 3, and hereafter) of an article. To compute this distribution, we constructed a frequency list of the journals in which the ref-of-refs were published, and converted it into a frequency list of ISI SCs using the SC attribution of each journal as given in the Journal Citation Reports.
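The conversion from a journal frequency list to an SC distribution can be sketched as follows. The journal names and the lookup table are hypothetical placeholders standing in for the Journal Citation Reports attribution, and counting a journal fully towards each of its SCs is an assumption; the source does not say how journals with several SCs are weighted.

```python
from collections import Counter

# Hypothetical journal -> Subject Categories lookup (placeholder for the JCR attribution).
JOURNAL_TO_SCS = {
    "Journal A": ["Biophysics"],
    "Journal B": ["Nanoscience & Nanotechnology", "Materials Science, Multidisciplinary"],
    "Journal C": ["Cell Biology"],
}

def sc_proportions(ref_of_ref_journals, journal_to_scs):
    """Turn a list of journals (one entry per ref-of-ref) into SC proportions."""
    journal_counts = Counter(ref_of_ref_journals)  # frequency list of journals
    sc_counts = Counter()
    for journal, count in journal_counts.items():
        # Assumption: a journal contributes its full count to every SC it belongs to.
        # Journals missing from the lookup are skipped.
        for sc in journal_to_scs.get(journal, []):
            sc_counts[sc] += count
    total = sum(sc_counts.values())
    return {sc: n / total for sc, n in sc_counts.items()} if total else {}

journals = ["Journal A", "Journal A", "Journal B", "Unlisted Journal"]
print(sc_proportions(journals, JOURNAL_TO_SCS))
```

The resulting proportions are the pi values that the diversity indices take as input.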
In order to compute the Stirling D diversity, a similarity matrix sij for the SCs must be constructed. To do so, we created a matrix of citation flows between SCs, and then converted it into a Salton’s cosine similarity matrix in the citing dimension. The sij describes the similarity in the citing patterns for each pair of SCs in 2006, for the SCI set (175 SCs).
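A minimal sketch of this conversion is given below. It assumes that row i of the flow matrix holds the citations given by SC i to every SC, so that the cosine between two rows compares citing patterns; the 3 x 3 matrix is made up for illustration and is not the 175-SC matrix used in the study.

```python
import math

def cosine_similarity_matrix(flows):
    """Salton's cosine between rows; row i is assumed to be the citing profile of SC i."""
    n = len(flows)
    norms = [math.sqrt(sum(x * x for x in row)) for row in flows]
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] > 0 and norms[j] > 0:
                dot = sum(x * y for x, y in zip(flows[i], flows[j]))
                s[i][j] = dot / (norms[i] * norms[j])
    return s

def distance_matrix(s):
    """Distances for the Stirling index, using d_ij = 1 - s_ij."""
    return [[1.0 - x for x in row] for row in s]

# Illustrative citation flows: SC 0 and SC 1 cite in the same proportions,
# SC 2 cites only itself.
flows = [
    [8, 2, 0],
    [4, 1, 0],
    [0, 0, 5],
]
s = cosine_similarity_matrix(flows)
d = distance_matrix(s)
print(s[0][1], s[0][2])  # similar citing patterns vs. no overlap
```

Because the cosine depends only on the shape of each citing profile, two SCs of very different size can still come out as highly similar.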
Operationalisation of network coherence
In order to operationalise network coherence for our bibliometric set, we first chose a similarity metric between network elements (articles) to measure the strength of their linkages, and second, an indicator of the structural coherence of the network. Since the aim was to map the breadth of knowledge sources, similarity was measured in terms of bibliographic couplings between articles (co-occurrences of references), and normalised using Salton’s cosine (Ahlgren et al. 2003). Then, basic network measures were used as indicators for network coherence:
• Mean linkage strength, S: the mean of the bibliographic coupling matrix, excluding the diagonal—equivalent to network density in binary networks. In valued networks, it describes both realised links and intensity of similarities. By definition, S has a value between zero and 1.
• Mean path length, L: the path length between two articles is defined as the minimum number of links crossed to go from one article to the other over the network. Mean path length describes how ‘spread’ the network is; it is computed after binarising similarities.
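Both measures can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: each article is represented by the set of references it cites, a link is kept after binarising whenever the cosine exceeds a threshold (zero here, an assumed value), and pairs with no connecting path are left out of the mean path length.

```python
import math
from collections import deque
from itertools import combinations

def coupling_matrix(ref_sets):
    """Bibliographic coupling normalised with Salton's cosine: |A & B| / sqrt(|A| * |B|)."""
    n = len(ref_sets)
    m = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        a, b = ref_sets[i], ref_sets[j]
        if a and b:
            m[i][j] = m[j][i] = len(a & b) / math.sqrt(len(a) * len(b))
    return m

def mean_linkage_strength(m):
    """S: mean of the coupling matrix, excluding the diagonal."""
    n = len(m)
    if n < 2:
        return 0.0
    return sum(m[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))

def mean_path_length(m, threshold=0.0):
    """L: mean shortest-path length after binarising (link if similarity > threshold).

    Assumption: unreachable pairs are skipped rather than counted as infinite.
    """
    n = len(m)
    total, pairs = 0, 0
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for v in range(n):
                if v != u and m[u][v] > threshold and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for v, steps in dist.items():
            if v != src:
                total += steps
                pairs += 1
    return total / pairs if pairs else float("inf")

# Three illustrative articles forming a chain: 0 and 2 share no references.
refs = [{"r1", "r2"}, {"r2", "r3"}, {"r3", "r4"}]
m = coupling_matrix(refs)
print(mean_linkage_strength(m), mean_path_length(m))
```

In this chain the two end articles are linked only through the middle one, which lowers S and raises L relative to a fully coupled set.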
Diversities H, I and D were found to be correlated.
Interestingly, the highest correlation was between Shannon H and Stirling D, although Stirling D and Simpson I (rather than Shannon) have similar mathematical formulations.
Since Shannon H gives more weight to the small terms in its sum through its logarithmic factor, while Stirling D gives more weight to the combinations of disparate SCs, we believe that the high correlation between H and D is due to the fact that many SCs with small proportions happen also to be distant from the core SCs.
Indicators of coherence, S and 1/L, were also highly correlated with one another, but not with the diversity measures.
Variety N was not correlated with any other measure, and it does not seem to be a good indicator of knowledge integration.
In this article, we proposed a novel conceptual framework to investigate interdisciplinary processes in the wider sense of knowledge integration. The framework is based on the concepts of diversity and coherence, ....
Diversity was used to capture the disciplinary heterogeneity of our bibliometric set as seen through the filter of predefined categories, i.e. taking a top-down perspective in order to locate the set on the global map of science (Fig. 6).
Coherence was used to apprehend the intensity of similarity relations within the bibliometric set, i.e. using a bottom-up approach to reveal the structural consistency and cognitive articulation of the publications network (Fig. 7).
Disciplinary diversity indicators were constructed from diversity indices (Shannon H and Simpson I) and a recently developed indicator (Stirling D, parameterised as Porter’s Integration), which takes account of the similarities between SCs (Stirling 1998, 2007; Porter et al. 2007). ISI SCs were used as disciplinary categories.
Network coherence was operationalised in terms of the network measures mean linkage strength and mean path length, in bibliographic coupling networks (see Havemann et al. 2007 for a similar approach).
First, we found that the indicators for disciplinary diversity and network coherence were not correlated (Table 4), thus providing ‘orthogonal’ perspectives of the knowledge integration process.
Since there is a trade-off between accuracy and simplicity of a taxonomy, it is possible that the unit of analysis (the article) in this study is too small for the coarse-grained description of science provided by ISI SCs.
Third, we found that measures for network coherence could discriminate among articles according to their different degrees of knowledge integration at micro level. ... The operationalisation of network coherence in terms of mean linkage strength of bibliographic coupling appeared to work well, both for our small sets and in larger studies reported by Havemann et al. (2007). Moreover, it has the advantage of simplicity.
Fourth, the visualisations of diversity (through the overlay of disciplinary proportions on the map of science, Fig. 6), and of coherence (by means of the bibliographic coupling network, Fig. 7), proved more valuable than expected.