Twitter Data Collection

🧾 Overview

This folder contains datasets of hate speech tweets and retweets collected between November 2020 and late 2021 using the Twitter API (before Twitter changed its developer access policies). The data was gathered from various organizations and committed individuals.

Dataset Details:

1) Included Information: 
  - Tweet IDs
  - Creation time
  - User account creation time
2) Excluded Information: 
  - Full tweet text
  - Other user metadata

These exclusions follow Twitter’s [Developer Policy](https://developer.twitter.com/en/developer-terms/agreement-and-policy).

Privacy and Ethical Use Notice

To protect the safety and privacy of individuals involved in or affected by the Tigray War:

- Full tweet content, user handles, and personal metadata are not shared.
- The dataset excludes deleted tweets, private account data, and direct messages.
- Any research or reuse of the data must strictly avoid doxxing, targeting, or profiling users.

Please treat all research derived from this data with caution and sensitivity to the context of the Tigray War.

Accessing Full Tweet Content

Full tweet content can be obtained in either of two ways:
1. Contact us directly.
2. Rehydrate the tweet IDs with tools such as [Twarc](https://twarc-project.readthedocs.io/en/latest/) or [Hydrator](https://github.com/DocNow/hydrator).
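
Hydration tools generally expect a plain text file with one tweet ID per line. The following is a minimal sketch of preparing such a file, assuming the data is stored as a CSV with a column named `tweet_id`. The column name, the file names, and the helper `write_id_file` are illustrative assumptions, not part of this dataset's documented layout, so adjust them to match the actual files.

```python
import csv
from pathlib import Path


def write_id_file(csv_path, out_path, id_column="tweet_id"):
    """Copy unique tweet IDs from a CSV into a one-ID-per-line text file.

    `id_column` is an assumed column name; change it to match the real header.
    Returns the list of IDs written, in first-seen order.
    """
    seen = set()
    ids = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # Keep IDs as strings: they are large integers and can lose
            # precision if parsed as floats.
            tweet_id = (row.get(id_column) or "").strip()
            if tweet_id.isdigit() and tweet_id not in seen:
                seen.add(tweet_id)
                ids.append(tweet_id)
    Path(out_path).write_text("".join(f"{i}\n" for i in ids), encoding="utf-8")
    return ids


# Example call (hypothetical file names):
# write_id_file("tweets.csv", "ids.txt")
```

With Twarc installed and configured via `twarc2 configure`, the resulting file can then be passed to `twarc2 hydrate ids.txt hydrated.jsonl`. Hydration requires valid Twitter API credentials, and tweets that have since been deleted or made private will not be returned.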

Intended Use

This data is intended for non-commercial, educational, and academic research, particularly to support:
- Documentation
- Advocacy analysis
- Historical records of the global digital activism response to the Tigray War

We hope it supports future research, data literacy, and critical storytelling efforts within and beyond the Tigrayan community.

Data Management

This dataset is curated and maintained by the Tigray Data Repository Initiative. For questions, feedback, or requests for restricted-access data (e.g., for graduate-level research), please contact us at info@datafortigray.org. 