Description
This dataset contains 2,000 emails curated for validating machine learning models that distinguish safe emails from phishing attempts. It mixes real-world email samples with artificially generated emails so that it reflects realistic email scenarios.
Each entry in the dataset includes the full text of an email and a corresponding label that categorizes the email as either 'Safe Email' or 'Phishing Email'. The dataset is intended for validating models after they have been trained, a step that helps confirm a model's accuracy and reliability before deployment.
Dataset Structure:
Total Emails: 2,000
Email Types:
Safe Emails
Phishing Emails
Attributes:
Full text of the email
Label indicating whether the email is safe or phishing
Example Entries:
Email Text: "Dear Jordan, your subscription has been succes..."
Email Type: Safe Email
Email Text: "Congratulations! You've won a $3000 gift card...."
Email Type: Phishing Email
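The validation use described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure. It assumes the data is available as CSV with columns named `Email Text` and `Email Type` (names taken from the example entries; the actual file layout is not stated here). The two rows are the truncated example entries, and `toy_model` is a hypothetical stand-in for a trained classifier.

```python
import csv
import io

# Column names and label strings assumed from the example entries;
# the real file may use different headers.
TEXT_COL = "Email Text"
LABEL_COL = "Email Type"
SAFE = "Safe Email"
PHISHING = "Phishing Email"

# The two truncated example entries, standing in for the real file.
SAMPLE_CSV = (
    'Email Text,Email Type\n'
    '"Dear Jordan, your subscription has been succes...",Safe Email\n'
    '"Congratulations! You\'ve won a $3000 gift card....",Phishing Email\n'
)


def load_rows(handle):
    """Read (text, label) pairs from a CSV file-like object."""
    reader = csv.DictReader(handle)
    return [(row[TEXT_COL], row[LABEL_COL]) for row in reader]


def toy_model(text):
    """Hypothetical placeholder for a trained model: flags prize-style wording."""
    lowered = text.lower()
    if "won" in lowered or "gift card" in lowered:
        return PHISHING
    return SAFE


def validate(rows, predict):
    """Compute accuracy, precision and recall, treating phishing as positive."""
    tp = fp = fn = tn = 0
    for text, label in rows:
        guess = predict(text)
        if guess == PHISHING and label == PHISHING:
            tp += 1
        elif guess == PHISHING and label == SAFE:
            fp += 1
        elif guess == SAFE and label == PHISHING:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


rows = load_rows(io.StringIO(SAMPLE_CSV))
metrics = validate(rows, toy_model)
print(metrics)
```

To run this against the full dataset, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle and `toy_model` with the trained model's prediction function. Precision and recall are reported alongside accuracy because a missed phishing email and a wrongly flagged safe email are different kinds of error.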
Acknowledgments: The authors would like to thank Sofia Tech Park and the Artificial intelligence and CAD systems laboratory for their assistance and support in conducting this research.
Date made available | 29 Aug 2024 |
---|---|
Publisher | Zenodo |