GeoGPT Frequently Asked Questions (FAQs)
Get quick answers to frequently asked questions about GeoGPT and clear up any doubts you may have.
About GeoGPT
1. What is GeoGPT?
GeoGPT is a specialized open large language model system developed to support geoscientific research and applications. It is inspired by the vision of Deep-Time Digital Earth (DDE), initiated at Yunqi Academy of Engineering, and innovated at Zhejiang Lab in collaboration with geoscientists globally. It is an open-source, non-profit global geoscience research project, aiming to promote the open science principles of collaboration, sharing, and co-construction.
It leverages Artificial Intelligence (AI) to analyze, interpret, and generate insights from extensive geoscientific datasets, assisting researchers, educators, and industry professionals in advancing earth sciences. GeoGPT is designed as a domain-specific tool that addresses the unique challenges faced in geoscience, combining global collaboration, advanced technology, and ethical frameworks to create an innovative solution.
2. What is the purpose of GeoGPT?
GeoGPT aims to address complex geoscientific challenges by offering innovative AI solutions, improving accessibility to data and knowledge, and fostering interdisciplinary collaboration to advance global research and innovation in the geosciences. The primary goal is to create an inclusive, transparent, and trustworthy platform that advances geoscience while adhering to ethical standards and open-access principles.
GeoGPT is designed and developed as a geoscience domain-specific LLM system, tuned to address questions related to the geosciences. It is not designed for, and should not be used for, exploring political ideologies or social topics outside the realm of the geosciences. Other tools are better suited to those purposes.
3. What is the relation between GeoGPT, Zhejiang Lab and DDE?
GeoGPT was developed under the leadership of Zhejiang Lab, and inspired by the vision of the DDE.
Zhejiang Lab and DDE are independent organizations, with GeoGPT drawing inspiration from DDE while maintaining full legal and operational autonomy. Based on the recently signed formal Memorandum of Understanding (MoU), both organizations have established a framework for collaboration that ensures transparency, accountability, and mutual benefit. This partnership is built on a commitment to responsible governance. DDE has taken steps to strengthen its oversight mechanisms, while GeoGPT's Governance Committee will continue to assess and guide the collaboration to ensure alignment with ethical and strategic priorities. While GeoGPT and DDE will leverage each other's strengths—combining DDE's geoscience expertise with GeoGPT's AI-driven innovations—GeoGPT will remain an independent initiative with its own governance and decision-making authority. The MoU formalizes this relationship, providing clear guidelines for cooperation while safeguarding the integrity and autonomy of GeoGPT.
4. What will be open in GeoGPT?
GeoGPT's commitment to openness aligns with best practices in scientific AI development while ensuring quality control, attribution, and responsible governance. The following components will be openly available:
- Model Weights – GeoGPT will release the corresponding model weights for relevant usage scenarios.
- Training Data Transparency – GeoGPT will publicly disclose its training sources (licensed geological literature, open-access datasets, curated scientific publications).
- Community Contributions – GeoGPT will encourage collaborative model improvement, enabling researchers to fine-tune versions for specific geological subfields.
Objectives and Goals
5. What are GeoGPT's primary objectives?
- AI Innovation: Develop advanced AI solutions to solve geoscientific challenges, such as geological modeling, climate analysis, and geohazard prediction.
- Ethical AI Development: Promote responsible, fair, and transparent use of AI tools within geosciences.
- Collaboration: Facilitate interdisciplinary scientific dialogue across academia, industry, and governments.
- Educational Advancement: Support geoscience education by creating tools for researchers, educators, and students.
- Open Access: Commit to using open-access data, fostering transparency, and supporting the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles.
6. What are the long-term goals of GeoGPT?
GeoGPT's long-term vision includes:
- Global Dataset Expansion: Curating comprehensive, globally representative geoscience datasets to reduce data gaps and biases.
- Technological Advancements: Enhancing model capabilities by integrating Retrieval-Augmented Generation (RAG), reasoning, and other AI innovations.
- Sustainability: Minimizing environmental impacts through efficient computational practices.
- Community Building: Establishing stronger partnerships with geoscience societies, publishers, and educational institutions to ensure mutual benefits.
- Supporting Global Challenges: Assisting in areas like resource management, climate resilience, and sustainable development goals (SDGs).
Data and Technology
7. How are GeoGPT models trained, and what training data has GeoGPT used?
The training process for the GeoGPT models consists of three key stages:
- Continual Pre-training (CPT): The continual pre-training stage uses a variety of datasets, including open-access papers and books, and geoscience-related materials from Wikipedia and Common Crawl that grant rights for non-commercial LLM training and model output sharing. This extensive corpus gives the model a solid foundation in geoscience literature.
- Multi-stage Supervised Fine-tuning (MSFT): The fine-tuning process is divided into three stages to refine the model's performance, using curated open-source QA pairs, the filtered Tulu-v3 dataset, geoscience-related QA data and long-context data.
- Direct Preference Optimization (DPO): The final stage, direct preference optimization, involves refining the model based on prompts paired with preferred answers. This ensures that the model's responses align more closely with user expectations and preferences.
Together, these stages ensure that GeoGPT is trained on a wide range of geoscience data, fine-tuned for specific tasks, and optimized for user preferences. All the data used for training comes from open-access publications with permission to train non-commercial LLMs and share model output. The data sources, along with how we use them, are detailed in the data file.
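To make the DPO stage more concrete, the sketch below computes the standard DPO loss from the published literature for a single preference pair. It is an illustration only: this FAQ does not describe GeoGPT's actual training code, and the function name, the argument names, and the `beta` value are assumptions. In the standard formulation each prompt comes with a preferred ("chosen") and a less preferred ("rejected") answer, and the inputs are the summed log-probabilities of each answer under the model being trained (the policy) and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (hypothetical helper).

    Each argument is the log-probability of a full answer given the prompt.
    `beta` controls how strongly the policy may drift from the reference.
    """
    # How much more the policy likes each answer than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # Loss is -log(sigmoid(logits)), evaluated in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# If the policy equals the reference, the loss is log(2).
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# If the policy favors the chosen answer more than the reference does, the loss drops.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

The loss falls as the policy raises the probability of the preferred answer relative to the rejected one, which is how this stage moves responses toward user preferences.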
8. Will the training data sources be publicly shared?
Yes, GeoGPT is committed to transparency and will publicly share information about its training data sources.
GeoGPT aims to foster trust within the scientific community by openly documenting the origins of its knowledge base. This commitment aligns with its goal of setting a new standard for openness in AI-driven geoscience research.
By combining a diverse, high-quality training corpus with a transparent approach to data documentation, GeoGPT sets itself apart as a leader in responsible AI development for geoscience. Its ongoing efforts to expand and update its knowledge base ensure it remains a reliable tool for researchers worldwide.
9. What foundational models does GeoGPT use?
GeoGPT integrates leading open-source foundational models, such as:
- LLaMA
- Mistral
- Qwen
- DeepSeek
These foundational models are adapted to meet geoscience-specific needs through domain-specific fine-tuning and data curation. By leveraging diverse foundational models, GeoGPT ensures:
- Inclusivity: Representation of global technological advancements.
- Fairness: Reduction of biases through diverse data sources.
- Performance: Integration of state-of-the-art AI models to meet high geoscientific standards.
In future releases, we will incorporate other open-source LLMs as part of GeoGPT's foundation models to allow diversified and balanced results.
10. How does GeoGPT provide adequate attribution of the work cited in the response to a prompt?
GeoGPT ensures proper attribution by using retrieval-augmented generation (RAG) techniques and structured metadata to accurately reference source material. It includes:
- Explicit Citations: Author names, publication details, and links to original sources are provided when referencing publications, datasets, or figures.
- Automated Source Tracking: A traceable record of sources is maintained to avoid misattribution.
- Context-Aware Attribution: Primary contributors are acknowledged when synthesizing knowledge from multiple sources.
GeoGPT collaborates with the geoscience community to refine citation practices and aims to set a new benchmark for AI-driven attribution. By prioritizing accurate, automated, and transparent citation mechanisms, GeoGPT fosters trust within the geoscience community and ensures that contributors receive the recognition they deserve.
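As a rough illustration of how retrieval with structured metadata can support explicit citations, the sketch below retrieves passages and formats a numbered reference for each one. It is not GeoGPT's implementation: the `Source` fields, the `retrieve` and `format_citations` helpers, and the word-overlap scoring are all assumptions chosen for brevity, and the corpus entries are placeholders rather than real publications. A production system would typically use an embedding-based retriever instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A retrievable passage with the metadata needed to cite it."""
    authors: str
    title: str
    year: int
    url: str
    text: str


def retrieve(query, corpus, top_k=2):
    """Rank sources by naive word overlap with the query (stand-in retriever)."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(s.text.lower().split())), s) for s in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Keep only sources that share at least one word with the query.
    return [source for score, source in ranked if score > 0][:top_k]


def format_citations(sources):
    """Build one numbered citation per retrieved source."""
    return [
        f"[{i}] {s.authors} ({s.year}). {s.title}. {s.url}"
        for i, s in enumerate(sources, start=1)
    ]


# Placeholder corpus: these entries are invented for the example.
corpus = [
    Source("Author A", "Placeholder title A", 2020, "https://example.org/a",
           "basalt weathering consumes atmospheric carbon"),
    Source("Author B", "Placeholder title B", 2021, "https://example.org/b",
           "landslide triggers in steep terrain"),
    Source("Author C", "Placeholder title C", 2022, "https://example.org/c",
           "weathering rates vary with climate"),
]

hits = retrieve("basalt weathering rates", corpus)
citations = format_citations(hits)
```

Because each retrieved passage carries its own metadata, the citation list can be attached to the generated answer, so every statement remains traceable to the source it came from.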
Governance and Ethics
11. What is the GeoGPT Governance Committee, and why is it important?
The GeoGPT Governance Committee is an independent body tasked with providing strategic oversight and guidance to ensure GeoGPT adheres to ethical, legal, and operational standards. It plays a critical role in:
- Ethical AI Development: Promoting unbiased and fair AI practices.
- Transparency: Ensuring open governance and public access to decision-making processes.
- Accountability: Overseeing compliance with global regulations and ethical frameworks.
- Stakeholder Collaboration: Engaging geoscience communities, industry partners, and the public to build trust and inclusivity.
The GeoGPT Governance Committee Terms of Reference (ToR) are attached at the end of the FAQ.
12. What are the responsibilities of the Governance Committee?
The Governance Committee's core responsibilities include:
- Oversight and Reporting: Monitoring compliance with ethical and legal standards.
- Risk Management: Identifying and mitigating risks, including data misuse and ethical breaches.
- Strategic Direction: Advising on GeoGPT's long-term goals and initiatives.
- Stakeholder Engagement: Facilitating partnerships with geoscience publishers, societies, and international organizations.
- Transparency Mechanisms: Ensuring the publication of governance reports, audits, and reviews to foster community trust.
13. How does GeoGPT address ethical and censorship concerns?
GeoGPT adheres to international ethical AI standards and maintains independence by:
- Law-Abiding: GeoGPT must follow the laws of every sovereign country in which it operates, as other LLMs do.
- Global Model Selection: Offering multiple foundational models (LLaMA from the USA, Mistral from Europe, Qwen and DeepSeek from China) to ensure inclusivity and neutrality.
- Jurisdictional Compliance: Ensuring that GeoGPT's scientific focus remains free from political concerns outside specific jurisdictions.
- Transparent Policies: GeoGPT's data usage, governance, and outputs are openly documented.
- Focus on Science: GeoGPT avoids topics unrelated to science to maintain its integrity and primary purpose.
14. How will GeoGPT avoid the moral and ethical hazards which concern scientists and the public regarding AI initiatives?
GeoGPT is committed to responsible AI development, addressing ethical concerns like bias, misinformation, transparency, and misuse through a multi-layered approach and robust governance. Here's how:
- Ethical Safeguards
- Scientific Integrity: Outputs are rooted in peer-reviewed geoscience sources, ensuring accuracy and preventing misinformation.
- Attribution: A built-in citation system credits researchers, respecting intellectual contributions.
- Bias Mitigation: Multiple foundation models from diverse regions (U.S., Europe, China) minimize regional or institutional biases.
- Non-Political Stance: Limited to geoscience, GeoGPT avoids political or ideological topics, maintaining focus and neutrality.
- User Privacy: No tracking or data retention protects user security and trust.
- Governance Framework
A dedicated Governance Committee ensures accountability:
- Ethical Oversight: Aligns with international AI standards (e.g., UNESCO, EU AI Act).
- Transparency: Open decision-making on updates and policies.
- Scientific Review: Experts validate outputs for reliability.
- Risk Management: Regular assessments adapt to emerging concerns.
- Leadership for Ethical AI in Geoscience
GeoGPT sets a standard for Ethical AI in Geoscience by:
- Aligning with global ethics principles.
- Engaging the geoscience community.
- Staying independent from political or commercial influence.
Collaboration and Use Cases
15. How can stakeholders participate in GeoGPT's development?
Stakeholders can engage through workshops, research initiatives, mutually beneficial collaborations, or validation reviews.
16. Who are the key collaborators in the GeoGPT program?
GeoGPT thrives on global collaboration and partnerships, welcoming contributors from every continent. Key collaborators include:
- Academic Institutions: Universities and research institutes worldwide that contribute expertise and authorized data.
- Geoscience Societies and Publishers: Ensuring access to high-quality, trusted geoscience literature and FAIR data.
- Industry Partners: Organizations across the globe leveraging GeoGPT for exploration, resource management, and environmental applications.
- Global Experts: AI and geoscience professionals who guide GeoGPT's development and implementation.
17. How can stakeholders collaborate with GeoGPT?
GeoGPT welcomes global collaboration through:
- Research Initiatives: Partnering on interdisciplinary studies and AI-geoscience innovation.
- Content Licensing Agreements: Establishing mutual trust to use licensed datasets while respecting intellectual property.
- Workshops and Knowledge Sharing: Providing platforms for discussions on AI applications in geoscience.
- Community Engagement: Inviting contributions to dataset validation, ethical reviews, and global representation.
18. What are some practical applications of GeoGPT?
GeoGPT addresses diverse geoscientific challenges, such as:
- Geological Modeling: Analyzing geological data to improve resource exploration and management.
- Climate Research: Supporting climate change modeling and sustainability initiatives.
- Geohazard Prediction: Assisting in predicting and mitigating natural disasters, such as earthquakes and landslides.
- Education and Training: Providing user-friendly tools for students, educators, and policymakers to enhance geoscience literacy.
- DeepResearch: Hypothesis generation for uncovering new insights in geoscientific studies and fostering innovation.
GeoGPT's versatility extends beyond the applications listed above, as it can provide solutions to a wide range of geoscientific and real-world challenges.
Transparency and Public Engagement
19. How does GeoGPT ensure transparency and build trust?
Transparency is at the core of GeoGPT's mission. It ensures:
- Open Reporting: Regular updates on data usage, governance decisions, and ethical audits.
- Stakeholder Access: Providing open forums and workshops for community input.
- Collaborative Oversight: Partnering with publishers and societies to validate data quality and ethics.
- Open-source Data and Program: Making datasets and tools publicly accessible to foster global collaboration and innovation.
GeoGPT ensures accountability, encourages global involvement, and builds trust through shared cooperation and transparency.
20. How does GeoGPT ensure data trustworthiness?
- Clear Data Disclosure: Publicly sharing data source composition, ensuring users understand model inputs.
- Collaborative Validation: Partnering with geoscience publishers and societies to curate trusted, high-quality datasets.
- Bias Monitoring: Implementing regular audits and reviews to identify and address biases in the model outputs.
- Open Governance: Establishing an independent Governance Committee to monitor transparency and guide ethical operations. The Governance Committee has already appointed two world-class geoscientists as its co-chairs, is soliciting committee members, and will hold its first meeting in London in March 2025.
21. How does GeoGPT address international concerns about independence?
GeoGPT's independent Governance Committee includes global experts who ensure:
- Neutral Oversight: Non-biased guidance on governance and strategy.
- Compliance: Adherence to international ethical, legal, and operational standards.
- Transparency: Full accountability in decision-making processes and data reporting.
- Diverse Representation: Including experts from various regions and geoscience disciplines to ensure balanced perspectives and inclusivity.
- Conflict Resolution Mechanism: Establishing clear protocols to address disputes or concerns, ensuring fair and impartial resolutions.
22. What measures are in place to prevent misuse of GeoGPT?
GeoGPT integrates safeguards such as:
- Usage Monitoring: Detecting and addressing unauthorized or unethical usage.
- Ethical Frameworks: Strict guidelines for ethical AI implementation.
- Stakeholder Reviews: Regular reviews and feedback loops to mitigate risks.
- Education and Awareness: Promoting responsible AI practices by providing comprehensive training programs.
23. Is GeoGPT censored or politically driven?
GeoGPT is designed to promote science and support researchers, scientists, and industrial practitioners worldwide in the field of geoscience. GeoGPT adheres strictly to ethical AI standards, ensuring it remains open and unbiased in its operation and outputs. Its focus is purely on advancing geoscience knowledge.
Highlights of our policy and user agreement are given below:
- Responsible Use: We strongly discourage using GeoGPT for hidden motives or for processing sensitive personal information. We recommend that users avoid entering personal details (e.g., addresses, bank account numbers, or passwords) that they do not wish to be reviewed.
- Read the User Guide and Agreement: We encourage you to review the User Guide and Agreement during sign-up. For full details, please refer to our User Agreement.
Data Usage and Privacy Concerns
24. Does GeoGPT use my content to improve model performance?
Currently, GeoGPT does not use users' uploaded data for model optimization. Should we need to use users' data in the future, we will ask for user consent.
25. Is my personal information shared with third parties?
Email addresses and phone numbers are collected and used solely for registration purposes, to ensure account safety and security.
We do not share personal information unless the user's explicit consent is obtained.
26. How do we handle registration terms, account creation, and deletion?
Our platform is built with data privacy at the forefront of every user interaction—ranging from registration to account deletion. Here's how we handle each aspect:
- Registration Terms:
- Transparency and Clarity: Our registration terms are written in plain language so users understand what personal data is collected, how it's used, and with whom it may be shared.
- Explicit Consent: We require users to actively agree to our terms and privacy policies before proceeding, ensuring informed consent right from the start.
- Regulatory Compliance: Our terms are regularly reviewed and updated to align with data protection laws like GDPR, CCPA, and other relevant regulations.
- Account Creation:
- Minimal Data Collection: We adhere to the principle of data minimization, collecting only the essential information needed to create and manage an account.
- Secure Data Handling: During account creation, all personal data is transmitted and stored securely using encryption and other best-practice security measures.
- Privacy by Design: Our systems are developed with privacy integrated from the start, ensuring that data protection is a key part of our infrastructure.
- Account Deletion:
- User-Controlled Process: We provide users with a clear and simple process to request account deletion, which reinforces our commitment to user autonomy.
- Compliance and Audit: Our deletion procedures are designed to meet regulatory requirements, and we perform regular audits to verify that our practices adhere to both internal policies and external legal standards.
27. Can I delete all my chat history?
Yes, users can clear and delete chat conversations or their entire chat history by clicking the delete button in the Chat.
28. Where is my content stored?
GeoGPT operates with a distributed infrastructure to ensure secure, reliable, and internationally compliant access for users worldwide. Content is stored on secure servers and systems managed by us and our trusted service providers, subject to strict confidentiality and security obligations.
Currently, GeoGPT also maintains a server in Singapore to support international users, with additional infrastructure planned to enhance global accessibility and compliance with regional data protection regulations.
29. Is my content shared with third parties including the Chinese Government?
Your content stays private and secure with GeoGPT.
GeoGPT does not share your content with third parties, including the Chinese Government, unless required by law and consistent with internationally accepted standards. We prioritize your privacy and data security with these key practices:
- Minimal Data Collection: We only gather basic usage metrics to improve performance—no personal data or detailed logs are collected.
- No Tracking: We don't use IP tracking or device fingerprinting.
- Secure Storage: Data is stored on distributed, multi-region servers to comply with global privacy laws. Data from outside China stays outside China on our international servers.
- No Unauthorized Sharing: Your data isn't shared with governments or companies unless mandated by law.
- Global Standards: We follow GDPR, CCPA, and other regulations to protect your privacy.
30. Does GeoGPT sell my data?
No, we do not sell your data to third parties.
31. Do humans view my content?
A limited number of authorized personnel and trusted service providers may access your content only for specific reasons:
- Investigating abuse or security incidents.
- Providing account support.
- Handling legal matters.
All access is logged, monitored, and restricted to authorized personnel on a need-to-know basis.
32. How do I submit a data privacy request?
You can email GeoGPT at support.geogpt@zhejianglab.org
Future Directions
33. What are GeoGPT's future plans?
GeoGPT will focus on:
- Expanding dataset diversity.
- Enhancing technical capabilities.
- Strengthening global partnerships.
- Building extensive partnerships with scientific societies, research institutions, and publishers.
Disclaimer
GeoGPT is a non-profit, open-source tool committed to transparency; it does not collect, sell, or share personal data, align with political agendas, or have hidden objectives. Use is entirely at your discretion.