• OriginStamp
    Trusted Time Stamping via Bitcoin

    OriginStamp is a web-based, trusted timestamping service that uses the decentralized Bitcoin blockchain to store anonymous, tamper-proof timestamps for any digital content. Users can hash files, emails, or plain text, store the resulting hashes in the Bitcoin blockchain, and later retrieve and verify timestamps that have been committed to it. OriginStamp is free of charge and easy to use, so anyone, e.g., students, researchers, authors, journalists, or artists, can prove that they were the originator of certain information at a given point in time.
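
    The hashing step can be sketched as follows. This is a minimal illustration, not OriginStamp's actual implementation: the use of SHA-256 and the helper names (fingerprint_bytes, fingerprint_text, fingerprint_file, matches) are assumptions made for this example, and submitting a hash to the service is left out entirely.

```python
import hashlib


def fingerprint_bytes(data: bytes) -> str:
    """Return the SHA-256 digest of raw bytes as a hex string (assumed hash function)."""
    return hashlib.sha256(data).hexdigest()


def fingerprint_text(text: str) -> str:
    """Hash plain text; the text is encoded as UTF-8 before hashing."""
    return fingerprint_bytes(text.encode("utf-8"))


def fingerprint_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(data: bytes, recorded_hash: str) -> bool:
    """Check whether content still corresponds to a previously recorded hash."""
    return fingerprint_bytes(data) == recorded_hash.lower()
```

    Only the hash has to be published, which is why the content itself can stay private: changing a single character of the content produces a different hash, so a match indicates that the content existed in exactly this form when the hash was recorded.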

  • CitePlag
    Citation-based Plagiarism Detection

    CitePlag is the first plagiarism detection system to implement Citation-based Plagiarism Detection (CbPD), a novel approach capable of detecting even heavily disguised plagiarism in academic texts. Existing plagiarism detection software examines only literal text similarity and thus typically fails to detect disguised forms of plagiarism, including paraphrases, translations, and idea plagiarism. CbPD addresses this shortcoming by additionally analyzing the placement of citations in the full text of documents to form a language-independent semantic “fingerprint” of document similarity.
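
    To make the idea of a citation-based fingerprint concrete, the sketch below compares two documents by the order in which they cite shared sources. It uses a plain longest-common-subsequence computation as an illustrative stand-in; it is not presented as CitePlag's actual algorithm, and the function name and the reference identifiers are hypothetical.

```python
def longest_common_citation_sequence(doc_a, doc_b):
    """Return the longest sequence of citations that appears in both documents
    in the same order (not necessarily adjacent).

    Each document is a list of reference identifiers in the order in which
    they are cited in the full text.
    """
    rows, cols = len(doc_a) + 1, len(doc_b) + 1
    # table[i][j] holds the length of the longest common subsequence
    # of doc_a[:i] and doc_b[:j].
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if doc_a[i - 1] == doc_b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back through the table to recover the sequence itself.
    sequence = []
    i, j = len(doc_a), len(doc_b)
    while i > 0 and j > 0:
        if doc_a[i - 1] == doc_b[j - 1]:
            sequence.append(doc_a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return sequence[::-1]


# Hypothetical citation orders of two documents.
suspicious = ["A", "B", "C", "D", "E"]
candidate = ["A", "X", "C", "E", "Y"]
shared_pattern = longest_common_citation_sequence(suspicious, candidate)
```

    Because the comparison uses only reference identifiers and their order, it is unaffected by paraphrasing or translation of the surrounding text.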

  • Docear
    Academic Literature Management via Mind Maps

    Docear combines a mind-mapping tool with a recommender system for academic literature and a reference manager. The mind maps allow users to organize their ideas and to import the annotations they made while reading PDFs, e.g., comments, highlights, or bookmarks. The software works with standard PDF annotations and can therefore be used with different PDF viewers.

  • Co-Citation Proximity Analysis
    Recommendation and Clustering Algorithms for Academic Literature

    Co-Citation Proximity Analysis (CPA) is a method to compute both local and global instances of semantic similarity in academic documents by examining citation proximity in the full texts of documents.

    CPA was developed with two applications in mind: recommender systems and clustering. For recommender systems, a measure of semantic document similarity that works at a more fine-grained resolution has the potential to significantly improve the relevance of academic literature recommendations.
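
    The proximity idea can be sketched as follows: two sources cited close together in a citing document are treated as more strongly related than two sources cited far apart. This is a minimal sketch for a single citing document. The weight values, the position format (section, paragraph, sentence), and the function names are assumptions made for this example, not the published definition of CPA.

```python
from itertools import combinations

# Assumed, illustrative weights: the closer two citations are, the higher the weight.
PROXIMITY_WEIGHTS = {
    "sentence": 1.0,
    "paragraph": 0.5,
    "section": 0.25,
    "document": 0.125,
}


def closest_level(pos_a, pos_b):
    """Return the smallest structural unit that contains both citation positions.

    A position is a (section, paragraph, sentence) tuple, where the paragraph
    is numbered within its section and the sentence within its paragraph.
    """
    if pos_a == pos_b:
        return "sentence"
    if pos_a[:2] == pos_b[:2]:
        return "paragraph"
    if pos_a[0] == pos_b[0]:
        return "section"
    return "document"


def proximity_scores(citations):
    """Score every pair of distinct references cited in one document.

    `citations` is a list of (reference_id, position) tuples. If a pair is
    co-cited several times, the closest occurrence determines its score.
    """
    scores = {}
    for (ref_a, pos_a), (ref_b, pos_b) in combinations(citations, 2):
        if ref_a == ref_b:
            continue
        pair = tuple(sorted((ref_a, ref_b)))
        weight = PROXIMITY_WEIGHTS[closest_level(pos_a, pos_b)]
        scores[pair] = max(scores.get(pair, 0.0), weight)
    return scores


# Hypothetical citing document with four citations.
example_citations = [
    ("A", (1, 1, 1)),
    ("B", (1, 1, 1)),
    ("C", (1, 1, 2)),
    ("D", (2, 1, 1)),
]
example_scores = proximity_scores(example_citations)
```

    Scores like these could be summed over many citing documents to rank candidate recommendations or to serve as edge weights for clustering.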

  • news-please
    An Integrated Web Crawler and Information Extractor for News

    news-please is an open-source, easy-to-use news crawler that extracts structured information from almost any news website. It can recursively follow internal hyperlinks and read RSS feeds to fetch both the most recent articles and older, archived ones. You only need to provide the root URL of the news website. news-please also features a library mode, which allows developers to use the crawling and extraction functionality within their own programs.

  • Mr. DLib
    Machine-readable Digital Library

    Mr. DLib's "Recommendations as a Service" allows operators of academic products to easily integrate a scientific recommender system into their products. The basic idea is that recommendations for research articles, calls for papers, grants, etc. are calculated on Mr. DLib's server. Operators of academic products can then request recommendations from Mr. DLib and display them to their users.

  • CITREC
    Open Evaluation Framework for Citation-based Similarity Measures

    CITREC is an open evaluation framework for citation-based and text-based similarity measures. CITREC prepares the data of two formerly separate collections for a citation-based analysis and provides the tools necessary for performing evaluations of similarity measures.
