As interest in building corpus-based and statistical NLP tools and applications grows worldwide, Arabic faces a critical obstacle in building such tools: a lack of language resources.
It is well known that the absence of diacritics in Arabic text is one of the challenges facing Arabic NLP. Accordingly, this corpus was built by the Arabic Language Technology Centre (ALTEC) (www.altec-center.org) as a language resource for Arabic to support research in Natural Language Processing.
Diacritized Corpus (Commercial: $1500; Academic: $150)
The academic version contains the basic version of the databases, and the commercial version contains the full version.
Egyptian companies listed with ITIDA and academia interested in the full version receive a 20% discount.
ALTEC members receive a 10% discount.
Files related to the Diacritized Corpus