An end-to-end tool to accelerate AI research by automatically fetching papers, synthesizing their contents, and organizing them into interactive knowledge graphs.
This application uses Natural Language Processing (NLP) models to extract summaries and core claims from arXiv papers, then visualizes the connections between papers based on semantic similarity.
- Automated Research Retrieval: Fetch recent research papers directly from arXiv based on your search topic.
- AI-Powered Synthesis: Automatically summarizes paper abstracts and extracts the core claims/contributions using Hugging Face Transformers.
- Semantic Similarity Analysis: Computes the semantic similarity between papers using Sentence-Transformers to discover connections.
- Interactive Knowledge Graphs: Builds and renders an interactive knowledge graph using NetworkX and Pyvis, illustrating how different research papers are related.
- Modern Dashboard: An intuitive Streamlit frontend enabling seamless interaction, search configuration, and visualization exploration.
- Robust API Backend: A FastAPI-based backend architecture handling the pipeline from data retrieval to graph generation.
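To illustrate the similarity step, the matrix computed in `backend/embeddings.py` could be produced by running abstracts through a Sentence-Transformers model and scoring pairs with cosine similarity. The sketch below uses small hand-made vectors in place of real model output (the vectors and paper labels are purely illustrative):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical abstract embeddings; in the real pipeline these would come
# from a Sentence-Transformers model, e.g. model.encode(abstracts).
embeddings = np.array([
    [0.9, 0.1, 0.0],   # paper A
    [0.8, 0.2, 0.1],   # paper B (close to A)
    [0.0, 0.1, 0.9],   # paper C (different topic)
])

# Pairwise cosine similarity; entry [i, j] scores papers i and j.
sim_matrix = cosine_similarity(embeddings)
print(np.round(sim_matrix, 2))
```

Entries near 1.0 indicate strongly related papers, so A and B would likely be connected in the graph while C stays apart.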
For recent updates, refer to the UPDATE_LOG.
.
├── app.py # FastAPI backend application
├── requirements.txt # Python dependencies
├── backend/ # Backend core logic
│ ├── fetch_papers.py # arXiv data retrieval
│ ├── summarize.py # Abstract summarization
│ ├── claim_extractor.py # Core claim extraction
│ ├── embeddings.py # Similarity matrix computation
│ ├── graph_builder.py # Knowledge graph generation
│ └── graph_visualizer.py # Graph HTML visualization
├── frontend/ # Frontend UI
│ └── streamlit_app.py # Streamlit dashboard application
├── lib/ # Additional utilities/modules
└── data/ # Directory for generated outputs (e.g., graph.html)
- Backend Framework: FastAPI
- Frontend UI: Streamlit
- NLP & Embeddings: Transformers, Sentence-Transformers, PyTorch
- Graph & Visualization: NetworkX, Pyvis
- Data Processing: Scikit-learn, NumPy, SciPy
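As a hedged sketch of the graph step, `backend/graph_builder.py` might turn the similarity matrix into a NetworkX graph, keeping only edges at or above the user-chosen threshold (the titles and scores below are made up for illustration):

```python
import networkx as nx

def build_similarity_graph(titles, sim_matrix, threshold=0.5):
    """Build a graph with one node per paper and an edge for every
    pair whose similarity meets the threshold."""
    graph = nx.Graph()
    graph.add_nodes_from(titles)
    n = len(titles)
    for i in range(n):
        for j in range(i + 1, n):
            score = sim_matrix[i][j]
            if score >= threshold:
                graph.add_edge(titles[i], titles[j], weight=score)
    return graph

# Toy data standing in for real paper titles and computed similarities.
titles = ["Paper A", "Paper B", "Paper C"]
sim = [
    [1.0, 0.82, 0.10],
    [0.82, 1.0, 0.15],
    [0.10, 0.15, 1.0],
]

g = build_similarity_graph(titles, sim, threshold=0.5)
print(g.number_of_nodes(), g.number_of_edges())  # 3 nodes, 1 edge (A and B)
```

For rendering, Pyvis can convert such a graph with `Network.from_nx(g)` and write the interactive HTML that ends up in `data/`.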
Ensure you have Python 3.8+ installed. It is recommended to use a virtual environment.
- Clone this repository or open the project directory.
- Install the required dependencies:
```
pip install -r requirements.txt
```
The application consists of a backend API and a frontend dashboard. You need to run both concurrently.
Run the FastAPI server using uvicorn (from the root directory):
```
uvicorn app:app --reload --host 0.0.0.0 --port 8000
```
The backend API will be available at http://localhost:8000. You can view the API documentation at http://localhost:8000/docs.
In a new terminal window, run the Streamlit application:
```
streamlit run frontend/streamlit_app.py
```
The frontend dashboard will automatically open in your default browser at http://localhost:8501.
- Open the Streamlit frontend.
- In the sidebar, enter a Research Topic (e.g., "Large Language Models", "Quantum Machine Learning", "Retrieval-Augmented Generation").
- Adjust the Max Results (how many papers to fetch) and the Similarity Threshold (minimum similarity score to form a connection in the graph).
- Click Run Analysis.
- The system will process the papers and display the Extracted Papers, their Summaries, extracted Claims, and an Interactive Knowledge Graph visualization.
This project is licensed under the MIT License; see the LICENSE file for details.
