Discovering and Analyzing Scientific Communities using Conference Network

Mussi Campos Cervera, Alejandro (2010) Discovering and Analyzing Scientific Communities using Conference Network. UNSPECIFIED.



    The growing number of scientific publications has made searching the digital scientific literature a difficult task, one that depends heavily on the researcher's ability to search, filter and classify content. The most widely used scientific literature search engines and portals, such as Google Scholar, Citeseer and ACM, rank query results using only simple text-based and citation-based scores, and the resulting ranking is barely useful. The number of references that a scientific publication has received (known as citations) is taken as a measure of the impact that the contribution has made on the community. Many methods (known as indices) for measuring or ranking researchers are citation-based. A fair index is important because it is used to evaluate and compare researchers for different purposes, such as university recruitment, faculty advancement and the award of grants, among others. The world of science has many fields (Human Science, Social Science, Computer Science, etc.). Each field has a different structure and different publication dynamics. For example, the number of citations in the top-20 most cited journals in Computer Science is 4 times higher than in the top-20 most cited journals in Social Science. Therefore, it is unfair to compare researchers using citation-based metrics without a context, in other words, without the community they belong to. Because communities differ in size, the metrics currently most used to measure the productivity or impact of researchers give an unfair evaluation when comparing researchers from different communities, since communities with higher productivity are likely to produce more citations than communities with lower productivity. This thesis presents a model and a tool for the detection and evaluation of scientific communities. Moreover, detecting them will allow the improvement of two important activities in scientific research: first, the search for scientific contributions.
Being aware of the existing relations between scientific entities, by knowing the communities they are part of, will enable more efficient search mechanisms, since the domain of a query can be narrowed down to particular communities or spread across different communities to obtain diversity of content. Moreover, a framework that supports discovering scientific communities will provide the means for a better understanding of social behavior in scientific research, making it possible to identify patterns in the development of projects, research trends, successful research profiles, and so on. Second, the assessment of people (researchers). InfEur2008 suggests that numerical indicators must not be used to compare research or researchers across different disciplines. Since the borders between disciplines are now blurring, it is hard to define a priori the disciplines to which someone belongs. Ad-hoc and evolving communities can provide a better basis for this. The approach presented in this thesis combines different clustering algorithms to detect overlapping scientific communities, based on conference publication data. The Community Engine Tool (CET) implements the algorithm and has been evaluated using the DBLP dataset, which contains information on more than 12 thousand conferences. The results show that our approach makes it possible to automatically produce a community structure close to the human-defined classification of conferences. The approach is part of a larger research effort aimed at studying how scientific communities are born, evolve, remain healthy or become unhealthy (e.g., self-referential), and eventually vanish.
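
The idea of grouping conferences from publication data can be illustrated with a minimal sketch. It is not the algorithm used in the thesis or in CET: the abstract does not specify the clustering methods, and the thesis detects overlapping communities, whereas this stand-in links conferences whose author sets overlap (Jaccard similarity) and reports the connected components, which never overlap. The conference names, author names, the function names and the 0.3 threshold are all assumptions made up for the example.

```python
from collections import defaultdict
from itertools import combinations

# Toy (conference, author) records. All names are hypothetical.
PAPERS = [
    ("ConfA", "alice"), ("ConfA", "bob"), ("ConfA", "carol"),
    ("ConfB", "bob"), ("ConfB", "carol"), ("ConfB", "dave"),
    ("ConfC", "erin"), ("ConfC", "frank"),
    ("ConfD", "erin"), ("ConfD", "frank"), ("ConfD", "grace"),
]


def authors_by_conference(papers):
    """Map each conference to the set of authors who published there."""
    authors = defaultdict(set)
    for conference, author in papers:
        authors[conference].add(author)
    return dict(authors)


def conference_network(authors, threshold):
    """Link two conferences when the Jaccard similarity of their
    author sets is at least `threshold` (an assumed cut-off)."""
    edges = defaultdict(set)
    for a, b in combinations(sorted(authors), 2):
        shared = authors[a] & authors[b]
        union = authors[a] | authors[b]
        if len(shared) / len(union) >= threshold:
            edges[a].add(b)
            edges[b].add(a)
    return edges


def detect_communities(papers, threshold=0.3):
    """Return connected components of the conference network.
    This is a simple non-overlapping stand-in for real clustering."""
    authors = authors_by_conference(papers)
    edges = conference_network(authors, threshold)
    seen, groups = set(), []
    for start in sorted(authors):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(edges.get(node, set()) - group)
        seen |= group
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    for group in detect_communities(PAPERS):
        print(group)
```

On this toy data, ConfA and ConfB share two of four distinct authors and ConfC and ConfD share two of three, so each pair forms one group, while no author links the two pairs. A real run on DBLP would need a clustering method that allows a conference to belong to several communities.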

    Item Type: Departmental Technical Report
    Department or Research center: Information Engineering and Computer Science
    Subjects: Q Science > QA Mathematics > QA076 Computer software
    Uncontrolled Keywords: "scientific communities", communities, "discovering communities", liquidpub
    Report Number: DISI-10-072
    Repository staff approval on: 21 Jan 2011
