This includes the number of recommendations per set (usually ten), how many recommendations were clicked, the dates of creation and delivery, the time required to generate the set and the corresponding user models, and information on the algorithm that generated the set. It generalizes over all users and assumes that they are all researchers, which is not exactly true because some users only use Docear for its mind-mapping functionality. Docear displays recommendations as publicly available research papers on the Web. Due to spam issues, no new anonymous users were allowed since late […]. By Joeran Beel and Bela Gipp. Instead of indexing the original citation placeholder with ,  , etc. For instance, we found that older users are more likely to click on recommendations than younger users [ 8 ], and that the labelling of recommendations has an effect on user satisfaction [ 4 ].
Local users choose not to register when they install Docear. The exact matching algorithm is randomly arranged. Researchers and developers in the field of recommender systems can benefit from publicly available architectures and datasets. The recommendations are not yet delivered to the user but only stored in the database. All nodes of the mind-maps, including attributes (text, links to files, titles of linked PDFs, and bibliographic data), are extracted from the XML file and stored in a graph database (neo4j). To match user models and recommendation candidates, Apache Lucene is used.
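The extraction of mind-map nodes from the XML file can be sketched as follows. This is a minimal illustration, not Docear's actual parser: it assumes a FreeMind-style layout in which every node is a `node` element carrying `ID`, `TEXT`, and `LINK` attributes. The function name `extract_nodes` and the sample map are hypothetical, and writing the result to neo4j is omitted.

```python
import xml.etree.ElementTree as ET


def extract_nodes(xml_text):
    """Return one dict per mind-map node, in document order.

    Assumes FreeMind-style <node> elements (an assumption, not taken
    from the paper).
    """
    root = ET.fromstring(xml_text.strip())
    nodes = []
    for element in root.iter("node"):
        nodes.append({
            "id": element.get("ID"),
            "text": element.get("TEXT", ""),
            "link": element.get("LINK"),  # None when the node links to no file
        })
    return nodes


# Hypothetical sample map, used for illustration only.
SAMPLE_MAP = """
<map version="1.0">
  <node ID="n1" TEXT="Recommender systems">
    <node ID="n2" TEXT="Content-based filtering" LINK="papers/cbf.pdf"/>
    <node ID="n3" TEXT="Evaluation"/>
  </node>
</map>
"""

if __name__ == "__main__":
    for node in extract_nodes(SAMPLE_MAP):
        print(node)
```

Each resulting dict could then be written to a graph store as one vertex, with parent-child relations added as edges.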
Architecture of Docear’s research paper recommender system. Kris Jack, et al.
Third parties could use the Web Service, for instance, to request recommendations for a particular Docear user and to use the 4. This means, on average, each user has linked or cited 92 documents in his or her mind-maps.
Rich, “User modeling via stereotypes,” Cognitive Science, vol. Due to spam Desktop software was started. Due to privacy concerns, this dataset does not contain the mind-maps themselves but only metadata. The CTR expresses the ratio of clicked to received recommendations.
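As a small worked example, the click-through rate (CTR) can be computed as the number of clicked recommendations divided by the number of received (delivered) recommendations. The function name and the numbers in the usage example are hypothetical and are not taken from the Docear datasets.

```python
def click_through_rate(clicked, delivered):
    """Return clicked / delivered as a fraction between 0 and 1."""
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    if clicked < 0 or clicked > delivered:
        raise ValueError("clicked must lie between 0 and delivered")
    return clicked / delivered


if __name__ == "__main__":
    # Hypothetical numbers: 6 clicks on 100 delivered recommendations.
    print(f"CTR = {click_through_rate(6, 100):.2%}")
```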
Information about revisions of The file papers. However, it should be noted that, for now, we developed the Web Service only for internal use, that there is no documentation available, and that the URLs might change without prior notification.
In this paper, we present the architecture of Docear’s research paper recommender system.
For the first step, the feature type to use from the mind-maps is randomly chosen. For each user, the label is randomly chosen when the user registers. CiteSeer’s dataset has been frequently used by researchers for evaluating research paper recommender systems [ 12 ], [ 14 ], [ 16 – 22 ]. The user modeling process varies by a number of variables (Figure 2).
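The random choices mentioned in this paragraph can be sketched as follows. This is only an illustration under stated assumptions: the candidate feature types and labels are placeholders rather than Docear's actual options, and the function names are hypothetical. The sketch draws a feature type afresh each time a user model is built, while a user's label is drawn once at registration and reused afterwards.

```python
import random

# Placeholder options (assumptions, not Docear's real configuration).
FEATURE_TYPES = ["terms", "citations"]
LABELS = ["label_a", "label_b"]


def choose_feature_type(rng):
    """Randomly pick the feature type for one user-modeling run."""
    return rng.choice(FEATURE_TYPES)


def assign_label(user_id, assignments, rng):
    """Pick a label at registration and keep it stable for that user."""
    if user_id not in assignments:
        assignments[user_id] = rng.choice(LABELS)
    return assignments[user_id]


if __name__ == "__main__":
    rng = random.Random()
    assignments = {}
    print(assign_label("user-1", assignments, rng))
    print(choose_feature_type(rng))
```

Passing the random number generator explicitly makes the assignment reproducible when a seeded `random.Random` is supplied, which helps when the effect of each variable is analyzed later.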
Between March and March, around 1, users registered every month, resulting in 21, registered users. While the research paper dataset is rather small and the metadata is probably of rather low quality, the dataset contains 1.
If we would disable statistics, concentrate on a few algorithms, and use a dedicated server […]. From the search results, a set of ten papers is randomly selected as recommendations. Consequently, they cannot use Docear’s online services such as recommendations or online backup, and we do not have any information about these users, nor do we know how many local users there are. The datasets were not originally intended for recommender stored.
He is interested in literature recommender systems, search engines, and human-computer interaction. Bollen and van de Sompel published an architecture that later served as the foundation for the research paper recommender system bX [ 27 ].
This paper will present related work, provide a general overview of Docear and its recommender system, introduce the architecture, and present the datasets.
All datasets are available here. This rather long computing time is primarily caused by the many statistics that we calculate for each set of recommendations, along with a few algorithms that require intensive computing power.
The Web Service stores some statistics, such as when the recommendations were requested and from which Docear version. The mind-map dataset is smaller than the dataset e. Each PDF is converted into text, and the header information and citations are extracted.
Between March and March, Docear delivered 31, recommendation sets with recommendations to 3, users.
Our datasets allow analyses beyond those we have already published, for instance to evaluate collaborative filtering algorithms, perform citation analysis, or explore the use of reference managers. A comparative analysis of offline and online evaluations and discussion of research paper recommender system evaluation. Screenshot of Docear. Every five days, recommendations are displayed to the users at the start-up of Docear.
In this case, no full-text dataset see section 2.