Computer Ethics - Philosophical Enquiry (CEPE) Proceedings

Conference Section

Data Ethics

Web harvesting and archiving refers to the processes of collecting works that reside on the Web and archiving them. It is one of the most attractive applications for libraries planning ahead for their future operation. When works retrieved from the Web are turned into archived and documented material held by a library, the number of works that can be found in that library can be far greater than the number of works harvested from the Web. The proposed participation in the 2019 CEPE Conference aims to present certain issues related to the existing legal framework, as well as technical and librarianship issues, that apply to Web harvesting and archiving. It will elaborate upon the applicable legal framework in order to shed light on what is legally sound and what is not in relation to Web harvesting techniques and processes. It will also elaborate upon the technicalities of implementing text and data mining (TDM). Currently, the EU Commission aims to promote the efficient use of TDM for scientific research purposes. To that end, the EU Commission proposes that Member States provide for an exception to the rights provided for in article 2 of Directive 2001/29/EC, articles 5(a) and 7(1) of Directive 96/9/EC and article 11(1) of the proposed Directive, covering reproductions and extractions made by research organizations in order to carry out text and data mining of works or other subject-matter to which they have lawful access for the purposes of scientific research. Thus, in the EU legal environment, the new Directive treats TDM as an exception to the reproduction right under copyright, intended solely for research.
The exception for scientific research can, in certain circumstances, cover the acts of reproduction performed in the course of data analysis activities even under the existing “acquis communautaire”, through the provision of article 5(3) of Directive 2001/29/EC. Web harvesting and archiving in Greek academic libraries is currently at an embryonic stage, at least in light of article 4§4(b) of Law 4452/2017, which provides that the National Library of Greece is empowered to deploy TDM in Greece and to oversee the deployment of TDM through other libraries. For academic libraries in Greece, research into the methods and applications of Web harvesting, as well as into the related policies, is of special interest. Legal stumbling blocks exist both with respect to data collection in the Web harvesting phase and with respect to data sharing when the Web-harvested output is archived and made available to the public.

Custom Citation

Kanellopoulou-Botti, M., Papadopoulos, M., Zampakolas, C., & Ganatsiou, P. (2019). Legal and technical issues for text and data mining in Greece. In D. Wittkower (Ed.), 2019 Computer Ethics - Philosophical Enquiry (CEPE) Proceedings (19 pp.). doi: 10.25884/yp3n-dq78. Retrieved from https://digitalcommons.odu.edu/cepe_proceedings/vol2019/iss1/11