§1.0. Introduction
§1.0.1. In recent decades, the creation of 3D digital representations of cuneiform tablets has increased significantly. In addition to the effort of digital preservation in museums (Robson et al. 2012) and the compilation of corpora for individual language groups (Chiarcos et al. 2018), the motivation lies primarily in the analysis of the paleographic and linguistic features of the tablets (Homburg 2021). In addition to radiometric measurement (Sánchez-Aparicio et al. 2018), a large part of these features is represented as geometry, which is why high technological demands are placed on the recording of the surfaces. Various technologies are available for the acquisition of 3D geometries. The quality of the results is strongly dependent on the hardware, the measurement setup and procedure, the operators’ experience, and the data’s subsequent processing (Polig, Hermon, and Bretschneider 2021).
§1.0.2. For the evaluation of the digital copies based on scientific methods, it is necessary to associate observations and their interpretation with individual areas of the recorded 3D geometry. Statements about features thus become referenceable and comprehensible and are available for critical examination. Annotations (Ponchio et al. 2020) have been established for this process for textual and graphical sources, as exemplified in the Welscher Gast Digital project.[1]
§1.0.3. Annotations open up the purely numerical representation of cultural heritage objects for their scientific evaluation. It is essential to distinguish between two levels when modeling an annotation: the annotation object, i.e., the technical record of the reference to the digitized object, and the conceptual modeling of the statement about the reference. Suppose a sign on a cuneiform tablet is annotated because of a paleographic feature. In that case, the digital representation of the section of the surface carrying the sign is created on the one hand and the statement about the details of the feature on the other. Annotations are thus closely linked to the intended result of the analysis.
§1.0.4. In this publication, we focus on processing and annotating data acquired by means of a structured light scanner and present a workflow developed in the DFG-funded project "The Digital Edition of Cuneiform Texts at Haft Tappeh, Iran".
§1.0.5. Workflows and criteria for the digital publication of 3D models of cuneiform tablets are investigated, and a particular focus is given to making the data available for machine learning approaches.
§1.0.6. In addition to addressing data format and acquisition issues, which are not discussed in detail here, we constantly align the workflow of processing and enriching with the metadata requirements. Every step requires metadata for implementation and generates information with which the metadata must be enhanced. For orientation and better comprehensibility, we show the framework of workflow and metadata in Figure 1. A detailed description of said workflow can be found in (Homburg et al. 2021).
§1.0.7. It has to be stressed that the conceptual parts of this paper lay the base for preparing cuneiform tablets in their digital representation for machine learning and artificial intelligence tasks such as Optical Character Recognition (OCR) in 3D data. The digital 3D twin of a tablet provides a maximum of information about the shapes of cuneiform signs, which is valuable for further tasks in paleography.
§1.1. Open Access criteria for cuneiform 3D data and derivatives
§1.1.1. In this publication we present a workflow that allows 3D data to be shared on the web as open access according to the following criteria. For us that entails first and foremost that data needs to be shared as FAIR data (Wilkinson et al. 2016), that is, data should be findable, accessible, interoperable and reusable. Apart from the role of data repositories, which need to ensure that data can be accessed via APIs and that datasets receive unique identifiers (URIs) to be better findable, other parts of the FAIR principles need to be ensured in the data itself. In particular, these concern the principles of reusability and interoperability. To make 3D data reusable, metadata with a detailed provenance (principle R1.2) need to be provided. For us, this entails defining metadata on the creation of 3D scans and their further processing, with author attribution and licensing (principle R1.1). These metadata should be described with already existing vocabularies following the FAIR principles (principle I.2), and we follow this advice wherever a suitable established vocabulary exists. Where none exists, we suggest vocabularies that may be adopted by the community as domain-relevant community standards (principle R1.3). We see the five-star principles of linked open data[2] (Bauer and Kaltenböck 2011) as one way to achieve a FAIR provision of 3D data, which makes metadata accessible in standardized vocabularies, in a machine-readable, standardized way using RDF and SPARQL, and with the possibility to link these data to other relevant data sets (the idea of linked open data). Hence, all suggestions in this publication are conceived or implemented with the linked open data paradigm in mind.
§2. Processing and data products in 2D and 3D
§2.0.1. This section describes the processing chain from raw 3D data delivered by a 3D scanning software to data products that are considered as data publications: Optimized 3D meshes and 2D renderings.
§2.1. Unpacking the black box
§2.1.1. Most scanners use proprietary, device-specific software to acquire and process measurement data into meshed point clouds. The data are often organized by projects in folder hierarchies, which can only be processed by other software in a few exceptional cases. Specifications or documentation describing these data stocks are either missing or incomplete. Therefore, independent reproduction or reprocessing of the data is mostly not possible, and the development of criteria for a quality-oriented provision of 3D data must start from pre-processed and proprietary data.
§2.1.2. In the Haft Tappeh project, we were able to work with data scanned by a structured light scanner (SLS) (cf. Section 3.1) in the year 2018 in the museum of Haft Tappeh, Iran. Due to the aforementioned reasons, these 3D scans were not available with sufficient metadata. In a master’s thesis conducted in the Haft Tappeh project (Mertens and Reinhardt 2020), we investigated the cuneiform tablet scans to identify information that could still be retrieved from the given scanning projects. It turned out that information about the measuring volumes used for the given 3D scans could be retrieved. However, due to missing protocols about the scanning setup, information about the calibration of the 3D scanner used in the scanning process was not retrievable, because the version of the scanning software used in 2018 did not implement an export of calibration information. In addition, the algorithms used to merge and/or further process the point clouds or 3D meshes are not known to the authors.
§2.1.3. Therefore, the only possible way of obtaining more information about the given 3D scans in the Haft Tappeh project was to reconstruct the data processing by scanning new objects with the original and an alternative scanning system in the controlled environment of the i3mainz laboratories and comparing the results.
§2.1.4. The Haft Tappeh project is, in this context, to be seen as one example out of many scanning projects in which - without a data processing and preservation concept - these metadata are usually not captured, and reproducibility of the scanning results cannot be achieved.
§2.2. Beneficiation of black box 3D data
§2.2.1. As there is virtually no open structured light 3D acquisition system, the acquisition process is a black box, represented by usually proprietary scanning software implementations. Some systems provide application programming interfaces (APIs), which allow querying and retrieving information about the parameters used. However, those are often not available, and the quality of the triangular mesh has to be double-checked anyway. Quality in this section means manifoldness and further topological and geometric properties relevant for the subsequent computing tasks. The following list of possible errors within a mesh’s structure may not be complete, and different software packages check additional properties. For our project, we primarily used the GigaMesh Software Framework[3] and investigated other packages like MeshDoctor[4] or pymeshfix.[5] The most common error indicators are (a small sketch of how some of them can be checked programmatically follows the list):
- Non-Manifold Edges, i.e., more than two triangles connected by an edge
- Faces with zero areas leading to numerical errors
- Border vertices indicating more or less small holes
- Number of connected components indicating small unwanted surface patches
- Ridges and vertices with extremely high curvature (Highly-Creased Edges or Spikes)
- Singular vertices and small tunnels indicating erroneous connections of the surface
- Self-intersections of the surface (cf. Klein bottle for a synthetic example)
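As an illustration, a minimal sketch of such checks could look as follows, assuming the open-source trimesh library (an assumption for demonstration; the Haft Tappeh project itself relied on GigaMesh for these checks) and a placeholder file name:

```python
# Minimal sketch: computing a few of the error indicators listed above with
# the trimesh library. The input file name is a placeholder.
import trimesh

mesh = trimesh.load("tablet_raw.ply", process=False)

# Edges shared by more than two faces are non-manifold; edges used by
# exactly one face lie on a border and indicate holes.
edge_groups = trimesh.grouping.group_rows(mesh.edges_sorted)
non_manifold_edges = sum(1 for g in edge_groups if len(g) > 2)
border_edges = sum(1 for g in edge_groups if len(g) == 1)

# Faces with (numerically) zero area lead to degenerate triangles.
zero_area_faces = int((mesh.area_faces < 1e-12).sum())

# Small connected components usually indicate stray surface patches.
components = mesh.split(only_watertight=False)

print(f"non-manifold edges: {non_manifold_edges}")
print(f"border edges:       {border_edges}")
print(f"zero-area faces:    {zero_area_faces}")
print(f"connected parts:    {len(components)}")
print(f"watertight:         {mesh.is_watertight}")
```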
§2.2.2. The aforementioned software packages typically allow removing and partially repairing these defects within a mesh. Since the start of the development of GigaMesh in 2009, a vast number of meshes from different 3D scanners have been processed, leading to insights on all sorts of illicit meshes created by the accompanying software packages. Note that none of the 3D file formats provides any means to ensure that a mesh fulfills any criteria of validity or quality. The cleaning, i.e., the removal of illicit surface areas and their repair, is described in (Mara 2012, section 4.4.2, p. 121) as a seven- and four-step algorithm applied iteratively. These steps and their ordering are optimized to keep the changes to the 3D model minimal. Since the publication in 2012, further improvements have been made, and the latest version of the GigaMesh software can be accessed as open source on GitLab.[6]
§2.2.3. In the Haft Tappeh project, we decided to use GigaMesh and the following mesh processing steps:
- Polishing, i.e., removal of illicit structures and filling holes.
- Orientation within the local coordinate system.
- Computing technical mesh properties, i.e., metadata.
- Colorization of the mesh’s vertices using the MSII filter.
- Export to "legacy" PLY.
§2.2.4. Orientation is crucial for rendering, as the object will be shown in the proper upright position as soon as it is loaded for display. If the orientation is not stored, it has to be set manually each time the dataset is examined. Even if the task of orientation only takes a minute or two, for collections of thousands of objects this can amount to weeks of wasted working time. If the orientation is saved, renderings can be recomputed pixel-identically. Pixel-identical means that a pixel generated from the same 3D model using one rendering style represents the same location on the 3D model as a pixel generated using another rendering style. This allows annotations taken on different screenshots to be transferable.
§2.2.5. The rendering can therefore be done in batches for whole directories with hundreds of 3D models without any user interaction, which saves a lot of working time when publishing larger object collections. The result is, at the same time, a data export and the curation of a catalog of numerous objects. The orientation step can be interchanged with the polishing step, but it is recommended to perform it second, as the positioning is influenced by stray vertices and small connected components removed by the cleaning algorithms. See Section §2.3 for the technical metadata. As the Haft Tappeh project’s 3D datasets did not contain any color information, the result of the MSII filtering is encoded in grayscale as color per vertex, allowing this high-contrast visualization to be inspected in any other software. The processing pipeline can also be used with Structure from Motion (SfM); as colors are not relevant to our project, we chose SLS. GigaMesh adds PLY-conformant information, e.g., about synthetic vertices added during the filling of holes, but many import functions of other software packages do not fully support the PLY standard. Only Meshlab (Cignoni et al. 2008) supports PLY import as defined by (Turk 1994) and is therefore capable of coping with custom data fields like flags for synthetic vertices. So we deliberately chose to publish "legacy" PLYs, in which no extra information is present. This allows the meshes to be imported into virtually all other programs. It also makes the technical metadata even more critical, as, e.g., the number of synthetic vertices is stored in those sidecar files.
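A batch orientation and export step could, for example, be sketched as follows. GigaMesh's own orientation routine is not reproduced here; aligning the principal axes of inertia with trimesh is merely a common heuristic used as a stand-in, and the directory name is a placeholder:

```python
# Sketch of a batch orientation step for a whole directory of meshes,
# assuming trimesh as a stand-in for GigaMesh's orientation procedure.
import pathlib
import trimesh

for path in sorted(pathlib.Path("meshes_raw").glob("*.ply")):
    mesh = trimesh.load(path, process=False)
    # Transform that maps the principal inertia axes onto the coordinate axes
    # (a rough heuristic for an upright tablet position).
    mesh.apply_transform(mesh.principal_inertia_transform)
    # Persist the orientation so later renderings stay pixel-identical.
    mesh.export(path.with_name(path.stem + "_oriented.ply"))
```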
§2.2.6. Mesh data can optionally be converted into any of the following data formats (McHenry and Bajcsy 2008):
- STL (Stroud and Xirouchakis 2000)
- PLY (Turk 1994)
- OBJ (Wavefront 1996)
- X3D (Brutzman and Daly 2010)
- COLLADA (Arnaud and Barnes 2006)
- Nexus (3DHOP streaming)[7]
§2.2.7. In the Haft Tappeh project, we chose to deliver 3D meshes in the PLY format, which follows the best practices of previous cuneiform 3D model publications such as HeiCuBeDa (Mara 2019). The PLY format allows for great accessibility, as it is well established and supported by various software, such as web-based and desktop applications. In the future, X3D or COLLADA might be a better choice, but at the time of writing, the 1990s PLY format still offered the best compromise of versatility and support among the available 3D software.
§2.3. Metadata (3D Processing)
§2.3.1. Since at the beginning of the Haft Tappeh project no vocabularies were available to describe the capture process of a 3D mesh in metadata, we proposed an appropriate standard and its implementation (Homburg et al. 2021). This allows future scans or possible rescans of cuneiform artifacts to save their data in a unified format. Broadly summarized, we distinguish metadata captured by the scanning software (the software delivered with the respective scanner being used) and metadata of the post-processing software (e.g., GigaMesh) used to clean or otherwise optimize meshes for later publication. Metadata obtained from the scanning software depends on the scanning software providing access to said metadata, either through an API or through data exports. In a to-be-developed processing step, these metadata need to be mapped and converted to the unified linked data format to be useful. This allows a user and a machine to retrace the steps involved in the creation of a 3D model (e.g., mesh cleaning, mesh orientation). Examples of 3D processing metadata include (a sketch of how such provenance could be expressed as linked data follows the list):
- Information about the applied scanning software
- Information on the scanning process
- Scanning sensors
- Algorithm to merge individual scanning results
- Time parameters on beginning/end of the scan
- Information about the scanner itself
- Information about post-processing steps, e.g., Mesh cleaning
- Information about the people involved in the creation of the 3D scan and their roles in the acquisition process
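To make this more concrete, a provenance record of this kind could be expressed in RDF, for example with rdflib, PROV-O, and Dublin Core. All URIs, timestamps, and property choices below are illustrative assumptions; the vocabulary actually used in the project is the one proposed in Homburg et al. 2021:

```python
# Illustrative sketch of scan/processing provenance as linked data using
# rdflib with PROV-O and Dublin Core. URIs and values are placeholders.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCTERMS, PROV, RDF, XSD

EX = Namespace("https://example.org/haft-tappeh/")  # placeholder namespace
g = Graph()
g.bind("prov", PROV)
g.bind("dcterms", DCTERMS)

mesh = EX["mesh/HT_07-31-95"]
scan = EX["activity/scan-2018"]
cleaning = EX["activity/gigamesh-cleaning"]

g.add((mesh, RDF.type, PROV.Entity))
g.add((scan, RDF.type, PROV.Activity))
g.add((scan, PROV.startedAtTime,
       Literal("2018-01-01T00:00:00", datatype=XSD.dateTime)))  # placeholder timestamp
g.add((scan, PROV.wasAssociatedWith, EX["agent/scanner-operator"]))
g.add((cleaning, RDF.type, PROV.Activity))
g.add((cleaning, PROV.used, mesh))
g.add((mesh, PROV.wasGeneratedBy, scan))
g.add((mesh, DCTERMS.creator, Literal("Example Creator")))  # placeholder attribution

print(g.serialize(format="turtle"))
```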
§2.4. Mesh Metadata (computed)
§2.4.1. Calculated meshes exhibit specific characteristics represented in metadata associated with the given mesh. These may be parameters about the size of the mesh, e.g., the number of points and the number of faces and further attributes, which may be computed by given processing software. Often, these parameters are not part of the metadata provided by the scanning software and need to be computed from values already exported previously. Examples of interesting metadata parameters which might be considered by Assyriologists and which are usually exported in either the scanning software or in the post-processing step of a scan are:
- width, height, and depth of the cuneiform 3D scan
- volume of the cuneiform tablet as captured by the 3D scan
- 3D-printing eligibility of the 3D scan
§2.4.2. While the first two parameters can be inferred from the 3D scanning software, parameters such as the 3D printing eligibility might need to be computed from parameters already given. These computations might be achieved by third-party scripts of post-processing software such as the ones shown in the context of Homburg et al. 2021.[8]
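A minimal sketch of such a computation, assuming trimesh and using watertightness as one possible (assumed) proxy for 3D-printing eligibility, could look like this; the project's own criteria and units may differ:

```python
# Sketch of deriving computed mesh metadata with trimesh. The file name is a
# placeholder, and the coordinates are assumed to be in millimetres.
import trimesh

mesh = trimesh.load("HT_07-31-95_oriented.ply", process=False)

width, height, depth = mesh.extents                    # axis-aligned bounding box size
volume = mesh.volume if mesh.is_watertight else None   # only meaningful for closed meshes

metadata = {
    "width_mm": float(width),
    "height_mm": float(height),
    "depth_mm": float(depth),
    "volume_mm3": float(volume) if volume is not None else None,
    # Watertightness as a simple, assumed indicator of 3D-printing eligibility.
    "printable_candidate": bool(mesh.is_watertight),
}
print(metadata)
```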
§2.5. Metadata (2D renderings)
§2.5.1. As renderings are exported in an image format such as JPG or PNG, they need to follow a metadata standard for images. We employed the metadata standard XMP (Tesic 2005) to append the following information to the annotated image resources:
- Author information according to the Dublin Core vocabulary (Dublin Core Metadata Initiative 2012):
- Author of the annotation (dcterms:creator)
- Creation date (dcterms:date)
- Subjects of the images (usually Cuneiform, Sign, Image) (dcterms:subject)
- Format of the image (e.g. image/jpg) (dcterms:format)
- License of the annotation (Usually CC0[9]) (dcterms:license)
- Information about the artifact:
- Tablet ID (dcterms:identifier)
- Tablet Surface Identifier (Obverse, Reverse, Left, Right, Top, Bottom) (dcterms:identifier)
- Information about the software which created the rendering
§2.5.2. This metadata ensures that an exported 2D rendering can be traced back to its cuneiform tablet identifier and the software it was created from. Besides attaching the metadata to the image, we also provide an export in a linked data format such as TTL. This export may be used to query specific image data in a linked data database.
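A sketch of such a TTL sidecar, built with rdflib from the Dublin Core terms listed above, could look as follows. All URIs and literal values are placeholders, and the XMP embedding into the image file itself is not shown:

```python
# Sketch of the TTL export for a 2D rendering using the Dublin Core terms
# listed above. URIs and values are placeholders.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS

EX = Namespace("https://example.org/renderings/")  # placeholder namespace
g = Graph()
g.bind("dcterms", DCTERMS)

img = EX["HT_07-31-95_obverse.png"]
g.add((img, DCTERMS.creator, Literal("Example Annotator")))   # placeholder author
g.add((img, DCTERMS.date, Literal("2021-01-01")))             # placeholder date
g.add((img, DCTERMS.subject, Literal("Cuneiform")))
g.add((img, DCTERMS.format, Literal("image/png")))
g.add((img, DCTERMS.license,
       URIRef("https://creativecommons.org/publicdomain/zero/1.0/")))
g.add((img, DCTERMS.identifier, Literal("HT 07-31-95")))      # tablet ID
g.add((img, DCTERMS.identifier, Literal("Obverse")))          # surface identifier

print(g.serialize(format="turtle"))
```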
§3. Data products ready for dissemination
§3.0.1. In the Haft Tappeh project, we produce various data products gained from the previously described input data. We describe these data products, their metadata, and their publication in the following.
§3.1. 3D Meshes
§3.1.1. For the acquisition of the tablets in the Haft Tappeh museum, a Range Vision Spectrum 3D scanner generously provided by the Helmholtz Institute Mainz was used. This kind of 3D scanner uses the principles of structured light and stereo vision, and its output consists of irregular triangulated meshes stored in the StereoLithography (STL) file format,[10] which are later converted for publication into the Stanford Polygon File (PLY) format (Turk and Levoy 1994). This particular 3D scanner provides neither color information as a texture map nor color per vertex. In general, color acquisition by this type of 3D scanner is relatively rare, as its primary use is surface geometry inspection for industrial tasks. Therefore, grayscale cameras, which provide higher geometric accuracy than the more complex color cameras, are built into those devices.
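The format conversion itself is a small step; a minimal sketch, assuming trimesh and placeholder file names, could be:

```python
# Sketch of the STL-to-PLY conversion step, assuming trimesh.
import trimesh

mesh = trimesh.load("scanner_export.stl", process=False)
mesh.export("scanner_export.ply")  # write the mesh as a PLY file
```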
§3.2. Renderings
§3.2.1. Although the digital twins of cuneiform tablets in 3D provide a maximum of detail of their surface geometry, it is quite practical to also have regular 2D raster images, i.e., renderings, at hand. The most practical reason is that these raster images can be previewed easily by the file manager of any operating system; previews like this do not currently exist for 3D models. Additionally, an image loads much faster than a 3D model, and the box shape of the tablet makes it easy to provide automated renderings of all six sides. Therefore, fat-cross renderings were produced, which adds compatibility to projects using photo cameras or flatbed scanners, as all surfaces of the cuneiform tablet are included in a fat-cross rendering. These surfaces are arguably the most interesting parts for Assyriological research and are often included in the CDLI repository. Using the GigaMesh Software Framework, we apply the Multi-Scale Integral Invariant (MSII) filter to compute curvature-based renderings showing the imprinted (concave) wedges in dark gray, while raised (convex) details like sealings are shown in light gray (cf. Figure 3). These images differ from photographs and from renderings using a light source typically placed over the top left corner of a tablet, in which the sides of the wedges are shown in different brightness levels depending on their orientation. Therefore, we provide such virtual-light renderings as a second set to be mixed and matched with preexisting images. Finally, a third set is computed as a 50:50 mixture of renderings using virtual light and MSII filter results. Those are most useful and appealing for analysis by human experts, presenting the finest details as well as the overall shape of the tablet, which is missing in the flat-looking MSII renderings. At the same time, a uniform, purely mathematically derived visualization facilitates the exploratory comparison of the tablets and their features by humans. Concept-wise, the mixed renderings are published on platforms for human interaction like HeidICON as an easydb instance,[11] while the others, meant for computational analysis, are published via Dataverses like HeiDATA[12][13] and for academia in the publication (Mara and Bogacz 2019).
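The 50:50 mixture can be sketched, for instance, with Pillow, assuming the virtual-light and MSII renderings were exported pixel-identically (same size and view); file names are placeholders and this is not the GigaMesh implementation:

```python
# Sketch of the 50:50 mixture of a virtual-light rendering and an MSII
# rendering using Pillow.
from PIL import Image

light = Image.open("HT_07-31-95_virtual_light.png").convert("RGB")
msii = Image.open("HT_07-31-95_msii.png").convert("RGB")

mixed = Image.blend(light, msii, alpha=0.5)  # equal-weight blend of both renderings
mixed.save("HT_07-31-95_mixed.png")
```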
§4. Annotations in 3D and 2D
§4.0.1. This section discusses annotations in 2D and 3D, which have been created for the aforementioned data products to be useful for machine learning purposes.
§4.1. Areas of interest
§4.1.1. After consultation with Assyriologists in our research project and in the international community, we identified the following areas of interest on a cuneiform tablet which are helpful to annotate on a 3D scan:
- Columns
- Lines
- Cuneiform Words (if available)
- Cuneiform Signs
- Individual wedges
- Broken parts of the cuneiform tablet
- Sealings and their pictorial descriptions and writings (currently a subject of research)
- Erasures of characters
- Rulings
- Firing holes
- Other pictorial elements
§4.1.2. Figure 2 shows an overview of all areas of interest highlighted on an abstract representation of a cuneiform tablet, except columns of cuneiform tablets and non-cuneiform style elements, which were not prevalent in the Haft Tappeh corpus. While annotations of characters may be considered most useful for machine learning tasks such as character recognition, all aforementioned annotation types can be useful for an automated classification and segmentation of cuneiform tablet contents.
§4.2. Representation of annotations in 2D
§4.2.1. Annotations in 2D are defined as areas of pixels on a given surface (here, the 2D renderings). For annotations in 2D, there are already established standards in the semantic web as well as support in specific image annotation frameworks such as Annotorious[14] and Recogito (Simon et al. 2017). Arguably the most widely used, and definitely one of the few linked-data-based, standards for annotating 2D images is the W3C Web Annotation model (Ciccarese, Sanderson, and Young 2017). This model defines annotations in linked data by employing an SVG selector, which may describe either a bounding box or a polygon area on the given 2D rendering or image.
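A minimal, hypothetical example of such an annotation, built as a plain Python dictionary and serialized to JSON-LD, is shown below; URIs, body text, and the polygon are placeholders, and the model actually used in the project is the one shown in Listing 1:

```python
# Minimal, hypothetical W3C Web Annotation with an SvgSelector on a rendering.
import json

annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "https://example.org/annotations/1",                 # placeholder URI
    "type": "Annotation",
    "body": {
        "type": "TextualBody",
        "value": "Cuneiform sign annotation (placeholder)",
        "format": "text/plain",
    },
    "target": {
        "source": "https://example.org/renderings/HT_07-31-95_obverse.png",
        "selector": {
            "type": "SvgSelector",
            "value": "<svg><polygon points='120,80 180,80 180,140 120,140'/></svg>",
        },
    },
}

print(json.dumps(annotation, indent=2))
```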
§4.2.2. Using the W3C Web Annotation data model (Ciccarese, Sanderson, and Young 2017) as shown in Listing 1, it was possible to conduct tests of annotations in 2D in the course of the Haft Tappeh project. Figure 3 shows the 2D rendering of the front side of the 3D scan of the cuneiform tablet HT 07-31-95 (P530974) with annotated cuneiform signs. Information may be attached to each annotated cuneiform sign and stored in the annotation body. For the data assembled alongside this publication, we considered adding line indexing information such as the line number, the character index, and the word index of a given character. Finally, we assign a sign reading and a character name according to the official Unicode cuneiform sign list to the annotation body and have the possibility to add a palaeographic description such as a Gottstein encoding (Panayotov 2015) or a PaleoCode (Homburg 2021) (cf. Table 1).
| Sign | Name | Gottstein Code | PaleoCode |
| --- | --- | --- | --- |
| … | MASZ | a1b1 | :/b-a |
| … | BAR | a1b1 | a;;b |
| … | LAL | a1b1 | //a#-sb |
| … | ME | a1b1 | a-:sb |
§4.2.3. Annotations in 2D work for many common to-be-annotated elements, but naturally face problems when annotations exceed the boundaries of the fat-cross segment they mainly target (e.g., when a cuneiform sign is written around a tablet corner). While we have not considered this case in the implementation we use, solutions which allow the continuation of a 2D annotation across rendering borders are feasible and could be implemented in a modified annotation tool. In this publication, we show the potential of 2D annotations on renderings in general.
§4.3. Transformation of 2D annotations to 3D
§4.3.1. The advantage of using 2D renderings from 3D scans in the Haft Tappeh project was the prospect of reprojecting 2D annotations onto the originally created 3D models. If properly processed, e.g., with the GigaMesh software framework, the positions on the 2D rendering can be related to positions on the 3D mesh, hence enabling a reprojection approach. The transformation from a 2D rendering to a 3D representation follows these projection rules. First, the points describing the 2D annotation need to be translated into the coordinate system of the 3D mesh. This is achieved in two steps:
- Rescaling of the x/y coordinates of the 2D annotation: Equivalent x/y coordinate pairs in the 3D model need to be calculated, taking the extent of the cuneiform tablet’s surface in its 3D space into account
- Z-Value Assignment: Z coordinate values need to be added to the rescaled 2D annotation
§4.3.2. While the rescaling approach is trivial if the maximum extents of the image and the 3D mesh surface are known,[15] the Z-coordinate assignment deserves more attention, as the depth of the respective annotated cuneiform sign is not known without further analysis. The further approach depends on the annotation type to be generated. If a bounding box is to be generated, it is often enough to assign a fixed z extent to the bounding box. The extent needs to be of such a depth that the bounding box completely encompasses every mesh point assigned to the cuneiform sign. The result of this assignment is eight corner coordinates that describe a 3D bounding box in the mesh coordinate system, as shown, for example, in Figure 5.
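A sketch of these two steps could look as follows; the image size, mesh extents, annotation polygon, and z values are all hypothetical, and the real values come from the rendering and the mesh metadata:

```python
# Sketch: rescale 2D annotation pixels to the mesh's x/y range and assign a
# fixed z extent to obtain the eight corners of a 3D bounding box.
def rescale(px, py, img_w, img_h, mesh_min, mesh_max):
    """Map a pixel (px, py) onto the mesh's x/y extent (image y-axis points down)."""
    x = mesh_min[0] + (px / img_w) * (mesh_max[0] - mesh_min[0])
    y = mesh_max[1] - (py / img_h) * (mesh_max[1] - mesh_min[1])
    return x, y

def bounding_box_3d(pixels, img_w, img_h, mesh_min, mesh_max, z_min, z_max):
    """Eight corner points of a 3D box enclosing the rescaled 2D annotation."""
    pts = [rescale(px, py, img_w, img_h, mesh_min, mesh_max) for px, py in pixels]
    xs, ys = zip(*pts)
    return [(x, y, z)
            for x in (min(xs), max(xs))
            for y in (min(ys), max(ys))
            for z in (z_min, z_max)]

# Hypothetical annotation polygon in pixel coordinates:
box = bounding_box_3d([(120, 80), (180, 80), (180, 140), (120, 140)],
                      img_w=2000, img_h=1500,
                      mesh_min=(0.0, 0.0), mesh_max=(62.0, 47.0),
                      z_min=-2.5, z_max=2.5)
print(box)
```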
§4.3.3. The bounding box created in this way may be used for further processing: it may be used to retrieve a maximum and minimum value for Z by examining the contents of the bounding box, and it may be used to extract all points which constitute the given cuneiform sign as point sets. The former may shrink the bounding box to a minimal enclosure; the latter may lead to the creation of more precise annotations (annotation as a volume in 3D), such as shown in Figure 6. By more precise annotations, we mean an annotation defined not by a bounding box but by a point set, which can precisely represent the structure of all cuneiform wedges belonging to the respective annotated cuneiform sign. To extract the given point sets, we used the shapely python library.[16] It provides a method "contains", which tests whether a point lies inside an object (here, inside the bounding box).
§4.3.4. The final step is to create vertices and faces for the given point sets. Vertices, in our case, are grouped into triangles to represent the faces in the mesh annotations we are trying to export. The goal is for these faces to form a coherent surface pattern, and we use the ear clipping triangulation algorithm provided by the library tripy[17] to perform the conversion for the front and back sides of the annotation.
§4.3.5. Even though it is possible to apply the ear clipping algorithm to generate faces for the sides of the annotation as well, we chose a simpler method under the assumption that the sides of the annotation form rectangles. This approach relies on the pre-calculation of the front and back sides and may be slightly more performant, but it can be replaced by the ear clipping algorithm without loss of generality.
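To illustrate these two steps, a minimal sketch using shapely's "contains" and tripy's ear clipping could look like this; the vertex list and the annotation footprint below are hypothetical placeholder data, not project data:

```python
# Sketch of the point-set extraction and triangulation steps described above.
from shapely.geometry import Point, Polygon
import tripy

# Hypothetical 2D footprint (x/y) of one annotation in mesh coordinates:
footprint = [(3.7, 9.4), (5.6, 9.4), (5.6, 11.2), (3.7, 11.2)]
polygon = Polygon(footprint)

# Collect all mesh vertices whose x/y position falls inside the footprint.
mesh_vertices = [(3.9, 9.6, 0.4), (5.0, 10.1, -0.8), (7.2, 9.9, 0.1)]  # placeholder data
inside = [v for v in mesh_vertices if polygon.contains(Point(v[0], v[1]))]

# Ear clipping of the footprint yields the triangles of the annotation's
# front face; the back face is handled analogously at a different z value.
triangles = tripy.earclip(footprint)

print(inside)
print(triangles)
```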
§4.3.6. In this way, researchers can create annotations on 2D rendering images in which cuneiform signs are, depending on the lighting conditions selected for rendering, sometimes better visible than on photographs taken with a camera. When converted to 3D, these annotations may take different shapes, as described previously. We will discuss the advantages and disadvantages of these representations in the next section.
§4.4. Representation of annotations in 3D
§4.4.1. Contrary to representations of annotations in a 2D space, e.g., some part of an image described by a 2D polygon, the shape of a 3D annotation has to be suited to the purpose the annotation is trying to serve. For example, it needs to be considered if an annotation is used to mark an area on the surface of the cuneiform tablet or if the annotation is supposed to contain the area of the cuneiform tablet (i.e., the point set) which represents the cuneiform sign.
§4.4.2. For sign annotations: should the annotation model individual cuneiform wedges, or is the area on which the cuneiform sign can be found the criterion to be considered? In the Haft Tappeh project, we tackled this question in an internship in the autumn of 2021 and came to the following conclusions:
§4.4.3. First, given a 3D model of a cuneiform tablet such as HT 07-31-95 (P530974) (cf. Figure 4), annotations can be created in two fundamental ways. The first way is the creation of minimum bounding boxes around the areas of a 3D model in which a cuneiform sign or another area of interest is found (cf. Figure 5). Bounding boxes may also be used to describe single cuneiform wedges. Annotations of this kind can be described as possible third-party annotations, as they may also be associated with an already published 3D model; in a sense, these annotations may be external.
§4.4.4. A second way of describing annotations is to mark (or label) areas in a given 3D model. A label is hereby associated with a point set, which it identifies. Labels are then associated with annotation contents, as shown in Figure 6.
§4.4.5. Both representations have their distinct advantages and disadvantages. The first representation is suitable for highlighting specific points of interest on a given cuneiform tablet and giving machine learning approaches a sufficient hint as to where interesting elements are located. The annotations can be created by a third party after the initial 3D model has been published and can be saved independently of the 3D model. This allows for greater flexibility in deciding who is allowed to annotate and how annotations may be stored. Labeling areas in a given mesh is an exact method of describing annotations, but the visualization of these annotations needs to be supported by the respective 3D viewing software. This is not an issue with the first way of annotation, as we can expect every 3D viewer to be able to display many 3D models at once, considering that every annotation constitutes its own 3D model.
§4.5. 3D annotations of individual wedges
§4.5.1. Analogous to the creation of annotations of cuneiform signs in the aforementioned example, annotating individual wedges of cuneiform signs is another possibility, which can be used in machine learning to recognize wedges as building blocks of signs. We assume that a sufficient amount of annotated wedges will lead to more precise sign discovery and help to register and analyze the variations a sign can have. Figure 7 shows the annotations created on the 2D rendering, akin to the annotations made in Figure 3, using exactly the same setup.
§4.5.2. Reusing our annotation setup, the transformation of the annotations from 2D to 3D as practiced for annotations of characters can be adapted for the transformation of single wedge annotations.
§4.5.3. To achieve wedge annotations in our test data, we chose to follow a line art description of the cuneiform tablet we selected for wedge annotations. Line arts are a quality-assured source of information created by scholars, which help massively in the annotation process but also have certain drawbacks which we need to highlight beforehand.
§4.5.4. Figure 8 shows that manually created line arts, automatically created line arts, and contents that can be observed on a 2D rendering of a 3D mesh, may deviate in details. In this particular case, one overlapping wedge is shown in a different way in all three representations. The reason for such deviations is often a matter of interpretation by the scholar who creates the corresponding line art.
§4.5.5. Figure 9 shows the line art of tablet HS 1174 (P134485), which we chose as our test case for wedge annotations. Examples of deviations between the line art representation and the wedges actually found on the cuneiform tablet are highlighted in red. These deviations concern either wedges visible on the 3D scan which are not present in the line art, or wedges present in the line art which cannot be found on the 3D scan.
§4.5.6. We understand that these deviations often do not disturb the process of transliterating a cuneiform tablet correctly, but we have to remark that, with a given interpretation by the individual scholar creating the line art, characteristics of a certain writer or of certain cuneiform sign variants may be lost when scholars interpret signs of the cuneiform tablet to a greater extent. A cuneiform 3D rendering does not interpret: it shows the results of the scanning process, and an annotation on the cuneiform tablet will only annotate what can be seen on the tablet. This is necessary to provide an authentic and consistent basis for OCR approaches and to potentially recognize, systematize, and categorize sign variants and do proper paleographic research. In conclusion, a wedge annotation on a 3D scan or a 2D rendering does not necessarily substitute for a line art; still, it may be another useful form of visualization for machines and people alike.
§4.5.7. If wedges are annotated, as is the case in Figure 10, these wedges can be automatically colored according to the types of wedges described by the Gottstein system. We explain the color coding in Figure 10 and furthermore distinguish the Winkelhaken as an additional wedge type ’w’. This color coding, however, should not be understood as advice or a rule for color-coding cuneiform wedges. It is merely a way of highlighting the results of an algorithm’s automatic recognition of wedge types. In a technical sense, the highlighting is achieved by defining label IDs in the underlying PLY file format, which are in turn converted to color codings in the GigaMesh software. While highlighting and color codings can be represented in other data formats, it is, to the authors’ knowledge to date, not possible to show any of these highlights in software other than GigaMesh.
§4.5.8. GigaMesh interprets additional numerical information attached to any coordinate in a PLY file, so that it can present these points in different colors according to their color code. In essence, the ID is used as a mapping to a corresponding Gottstein code, and it is calculated on the 2D annotation during the conversion process to 3D. It could also be calculated on a 3D annotation using a modified algorithm.
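As an illustration, per-vertex label IDs could be written into a PLY file with the plyfile library as sketched below. The property name "labelid", the ID-to-Gottstein mapping, and the vertex values are assumptions for demonstration; GigaMesh's actual encoding may differ:

```python
# Sketch of storing per-vertex label IDs as an extra vertex property in PLY.
import numpy as np
from plyfile import PlyData, PlyElement

gottstein_ids = {"a": 1, "b": 2, "c": 3, "d": 4, "w": 5}  # assumed mapping

vertices = np.array(
    [(0.0, 0.0, 0.0, gottstein_ids["a"]),
     (1.0, 0.0, 0.2, gottstein_ids["a"]),
     (0.5, 1.0, 0.1, gottstein_ids["w"])],
    dtype=[("x", "f4"), ("y", "f4"), ("z", "f4"), ("labelid", "u4")],
)

element = PlyElement.describe(vertices, "vertex")
PlyData([element], text=True).write("wedge_labels.ply")
```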
§4.5.9. To calculate the code in 2D, a character is considered a Winkelhaken if the annotation has four or fewer points. In this case, the annotation gets the Gottstein code ’w’. For any other Gottstein type, the angle between the x-axis and a straight line through the two furthest points of the annotation is considered, resulting in the following Gottstein representations (a sketch of this classification follows the list):
- Angle >70 and <110 degree: Gottstein Code ’a’
- Angle >=160 and <=180 degree or >=0 and <=20 degree: Gottstein Code ’b’
- Angle >=110 and <=145 degree: Gottstein Code ’c’
- Angle >=146 and <=165 degree: Gottstein Code ’d’
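A sketch of this classification could be implemented as follows; the angle is folded into the range 0–180 degrees, angles outside the listed ranges stay unclassified, and the example points are hypothetical:

```python
# Sketch of the wedge classification rules listed above.
import math
from itertools import combinations

def gottstein_type(points):
    """points: list of (x, y) pixel coordinates of one wedge annotation."""
    if len(points) <= 4:
        return "w"  # Winkelhaken
    # Straight line through the two furthest points of the annotation.
    p, q = max(combinations(points, 2), key=lambda pq: math.dist(pq[0], pq[1]))
    # Angle against the x-axis, folded into [0, 180).
    angle = math.degrees(math.atan2(q[1] - p[1], q[0] - p[0])) % 180
    if 70 < angle < 110:
        return "a"
    if angle >= 160 or angle <= 20:
        return "b"
    if 110 <= angle <= 145:
        return "c"
    if 146 <= angle <= 165:
        return "d"
    return None  # outside the listed ranges

print(gottstein_type([(0, 0), (4, 1), (9, 2), (14, 3), (20, 4)]))  # roughly horizontal -> 'b'
```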
§4.5.10. We want to remark that this classification was used as a test, that the algorithm is not necessarily a perfect recognition of Gottstein codes, especially when wedges are very small, and that it could be improved upon. It should be seen as a demonstration of what additional information can be detected once a wedge annotation has been successfully created.
§4.5.11. In addition, each annotated wedge could be assigned another ID, the ID of the bounding box, which marks the cuneiform sign it belongs to. This makes it possible, e.g., for machine learning algorithms to derive a connection between wedge locations and cuneiform signs for better classification purposes and to incorporate different wedge variant representations as machine learning features.
§4.5.12. In our example in Figure 10, the wedges have been automatically colored according to the Gottstein system (Panayotov 2015). This suggests that it might be a feasible future approach for Assyriologists to repaint individual wedges of a cuneiform 2D rendering to mark significant character variants in a digital edition workflow. The marked variants could then be automatically converted into a Gottstein code or, with further modifications, possibly also into a PaleoCode to provide a machine-readable representation for machine learning tasks.
§4.6. Data formats, annotation and software compatibility
§4.6.1. Now that we have introduced how annotations can be created in principle, we will match the predefined annotation types to data formats. If 3D annotations need to be defined, there needs to be a data format to save these annotations.
§4.6.2. In general, three different approaches to saving 3D annotations come to mind. The first one is to save the annotation as part of the 3D mesh itself, i.e., to create annotation labelings inside a given mesh, as shown in Figure 10. This can be achieved in standard data formats such as X3D, PLY, and OBJ as part of the 3D model. If annotations are saved in this way, the following particularities apply:
- A 3D viewer needs to be able to highlight labelings on 3D models, to support color-coding them, and to show the annotation contents
- Annotations created in this way have to be saved in the 3D model itself, i.e., these annotations are not applicable to already published 3D models with a digital object identifier (DOI) (Paskin 2010) unless these are republished. Already published 3D models with a DOI cannot be changed in their current form, and republishing a 3D model might entail legal consequences which may not be overcome by the scholar annotating the 3D mesh.
- Sharing of annotation contents is not possible without sharing the 3D mesh as well
§4.6.3. The second approach is to save the annotations as separate 3D models, i.e., as separate files in the format of choice. The result is one mesh file for the original 3D scan and many mesh files representing the 3D annotations. This approach may produce many individual 3D annotation files, depending on the number of individual features to be annotated on the respective cuneiform tablet (e.g., the number of characters to be annotated). However, it should not be a problem for any 3D viewer to view and overlay many 3D models at once. The following properties can be observed when annotating as separate 3D models:
- Linking of the original 3D mesh with the 3D annotations
- A 3D viewer needs to be able to show annotation bodies supplied with the annotation data
- Anyone may create new annotations, hosted anywhere, which may be loaded together with the original mesh
This method overcomes the need to modify the original mesh, but its disadvantage lies in having to load many meshes. However, this method is supported by most if not all 3D viewers currently available.
§4.6.4. The third approach uses the Web Annotation data model to represent the 3D annotation. This method allows storing the annotation separately from the given 3D mesh. However, the mesh is referenced using a linked data vocabulary. Using the web annotation data model provides the following advantages:
- Annotations are stored in one sidecar file, which might be loaded together with the 3D mesh
- Annotations can be created after a mesh has been published with its DOI because annotations are independent of the mesh publication
- Annotations are created in an established semantic web vocabulary and can therefore be easily included in and queried with methods of linked open data (Bauer and Kaltenböck 2011) such as SPARQL (Harris and Seaborne 2013)
§4.6.5. Despite the advantages of using the web annotation data model, its usage requires the definition of a new selector for 3D data. Listing 2 shows such a modified selector using the Well-Known-Text format (Herring and others 2011), which allows for the definition of a 3D annotation.
§4.6.6. While we have tested these annotations in a practical project, the disadvantage of this kind of annotation currently lies in missing software support. Current 3D viewers, except for possibly 3DHOP,[18] cannot parse the newly defined 3D selector annotations because they have not been standardized yet. Nonetheless, we see potential in defining and perhaps refining a 3D Selector definition for the Web Annotation Data Model to join 3D annotation data with similar 2D annotation data and combine it into the cuneiform linked open data cloud.
§4.6.7. Finally, data formats might allow the definition of 3D annotations within their own data models. In the Haft Tappeh project, we have identified the X3D format as the only representative (Yu and Hunter 2014) that allows embedding annotations. In addition, formats such as COLLADA 3D[19] allow for the attribution of specific, limited metadata fields to label components.
§4.6.8. Therefore, we currently recommend publishing annotations either within the framework of the X3D model or using a web annotation model approach that is still to be defined but which might rely on definitions in Well-Known Text (WKT).
§4.7. Sign images with metadata extracted from annotations
§4.7.1. Depending on the nature of the annotations given, metadata must be defined for the shared annotations. This section discusses how metadata should be prepared to suit the data format with which it is being served.
§4.7.2. If the metadata of 3D annotations is to be exported, the metadata associated with the 3D annotation should mimic the metadata used for the description of the 3D models. However, the provenance of the given 3D annotation needs to be clear about the origin of the annotation, i.e., it should link to the metadata provided by the 3D model. Suitable export formats we used in the Haft Tappeh project are TTL (Beckett et al. 2014), XML, and JSON-LD (Champin, Kellogg, and Longley 2020) serializations.
§4.7.3. Metadata of annotations gained from renderings (Figure 11) are to be represented in the metadata formats TTL and XMP.
§4.7.4. While XMP can be used to create metadata (Baca 2016) associated directly with the image, TTL represents similar metadata in RDF as a linked data graph describing virtual instances of every annotation. In any case, we expect the metadata of 2D annotations in any format to include the following information, which is also included in our data:
- Transliteration
- Tablet ID
- Tablet Surface Identifier (Obverse, Reverse, Left, Right, Top, Bottom)
- Line number
- Word Index (the index of a word in a line)
- Character Index (the index of a character in a line)
§4.7.5. Additional annotation contents depend on the purpose of the annotation and may or may not be reflected in the image metadata as well.
§5. Discussion
§5.0.1. Given the example data above, we discuss their applicability in different software environments and on the web, especially under sustainability and likely future development considerations.
§5.1. 3D raw processing support
§5.1.1. Concerning 3D raw processing support, we see the need for documentation of the whole measurement process, up to the publication of the post-processed mesh, in a unified format. We think this should become the standard in the upcoming years. However, this idea requires the implementation and adoption of metadata standards in the 3D scanning community and possibly the extension or new standardization of metadata standards for several 3D formats. While backed by publications such as (Homburg et al. 2021), it will take time and effort before this is supported in many mesh post-processing software packages. Whether the industry of scanner manufacturers is willing to support the metadata schema is yet another question to be answered and is far from certain. Still, we find it important to propose a metadata schema for correct documentation so that the reproducibility of cuneiform 3D scans can be assured.
§5.2. Annotation vocabularies
§5.2.1. In the Haft Tappeh project, we have created a customized vocabulary for 3D annotations. We have also experimented with defining a new selector for annotations in the W3C Web Annotation Data Model. The fact that this was necessary points toward a bigger problem in the Assyriology community: to the authors’ knowledge, the definition and standardization of annotation vocabularies are currently not being pursued, neither for image data nor for transliteration data. In the future, we hope that a consolidation effort will help to define annotation vocabularies with different focuses for:
- Cuneiform annotations prepared as machine learning features
- Cuneiform annotations prepared for the Assyriologist
§5.2.2. This will make annotations on cuneiform text and image data not only more targeted than a comment, but it will also significantly improve the interoperability between the research disciplines. It will be even more important to focus on the interconnectivity between annotations, especially between parts of transliterations as a form of interpretation and resources created on non-interpreted media (such as images and 3D scans as products of a scanning process). Annotations should not only be connected in the data, but their connections should also be made visible in user interfaces, so that an annotation in a cuneiform text can be traced to a feature on the 3D scan representation of a cuneiform tablet and vice versa. In conclusion, we advocate a process of standardization of annotations so that a better digital representation of cuneiform in all media and their connections can be achieved.
§5.3. Feasibility of annotation types
§5.3.1. We have previously discussed different methods of annotation, how these annotations can be created, and how they can be represented. Here, we would like to argue that the most useful type of annotation might be to store the annotations in the 3D file format itself (e.g., X3D), because then the format would be self-contained, i.e., easy to use for a researcher. However, this kind of self-containedness, while useful for the individual researcher, lacks the interconnectivity one would expect from linked data annotations. Therefore, a combination of both could be interesting, e.g., a 3D model with labels and annotation information, or a 3D model with a sidecar file of web annotations that is actually processable in 3D viewing software such as GigaMesh.
§5.4. Support of annotation standards in software
§5.4.1. A survey of 3D viewing software conducted in the context of the Haft Tappeh project showed that not many 3D viewers provide support for displaying annotations, let alone for showing annotation contents.
| 3D Viewer | Annotations | Annotation contents | Reference |
| --- | --- | --- | --- |
| 3DHOP | Yes | Yes | (Potenziani et al. 2015) |
| Blender | Yes | No | (Kent 2015) |
| GigaMesh | Yes | No | (Mara et al. 2010) |
| Meshlab | Partial | No | (Cignoni et al. 2008) |
§5.4.2. Table 2 shows the software we tested in the Haft Tappeh project. 3DHOP, as a JavaScript mesh viewer, showed the most promising results for the integration of the different annotation types presented previously. 3DHOP can load PLY and OBJ files, so that annotations stored as separate PLY models can be viewed simultaneously. In addition, 3DHOP, due to its JavaScript-based implementation, may also process annotations from JSON-LD formats such as the W3C Web Annotation data model and may give user-customized feedback in a web browser, including the annotation contents.
§5.4.3. Figures 12 and 13 show that, as a result of our experiments, it was already possible to display annotations and annotation contents in 3DHOP. Still, it would be necessary to extend the 3DHOP software with proper support for an extended web annotation data model definition.
§5.4.4. GigaMesh and Blender allow loading different meshes and enable the visualization of labelings but, as of the writing of this paper, do not support showing annotation contents. The software Meshlab can display many meshes simultaneously but cannot visualize labelings or annotation contents. Therefore, we see the potential to include annotation support in a variety of applications and to create prototypes that can show the added value of annotation support.
§6. Application and example data
§6.1. This section discusses the application and sample data we have used as a showcase for the data products elaborated previously and explains how those data products were obtained.
§6.2. The data mentioned in this publication is published as open access on Zenodo[20] and includes:
- Two 3D models HT 07-31-95 (P530974) and HS 1174 (P134485) in the formats PLY, X3D and COLLADA including scanning metadata
- Annotations on renderings of HT 07-31-95 (P530974) and HS 1174 (P134485) in JSON-LD
- 3D annotations calculated from the previously mentioned 2D annotations in PLY, JSON-LD and included within 3D model files (PLY, X3D, COLLADA)
§6.3. We publish a more comprehensive description of the dataset in a separate data publication in the Journal of Open Archaeological Data (JOAD) (Homburg et al. 2022).
§6.4. The motivations of this publication are twofold. First, we would like to provide users with practical data so that they can judge which annotation method is suitable for their needs. Second, we would like to stimulate a discussion within the community of computational Assyriology as to which publication method is preferable and which the community sees as the most likely to be followed in the future.
§6.5. Besides these motivations, the data publication might serve as a hint to software developers of 3D viewing software to include proper annotation support and as a motivation to enable the creation of proper annotations within said software environments.
§7. Conclusions
§7.0.1. This publication describes our lessons learned and best practices for preparing and processing 3D mesh data gained from cuneiform tablets. We discussed how 3D scans and their derivatives should be appropriately processed and documented before being published in a given repository online. We have pointed out annotations created on 3D scans and annotations on 2D renderings as essential elements of the scientific discourse in and beyond Assyriology. Finally, we have discussed the shortcomings of current technologies and proposed solutions for these shortcomings, to be pursued in dialogue between the Assyriology and computer science communities.
§7.1. Outlook
§7.1.1. We see the potential of extending the Web Annotation Data Model for the representation of 3D annotations, so that a standard annotation format may be used to express 3D annotations in a variety of software, instead of relying on a variety of formats that either do not support annotations or support only a fraction of the functionality one would expect from full annotation support. If the Web Annotation Data Model can be used, this would also allow storing 3D annotations in semantic web databases. In any case, we advocate the inclusion of 3D annotations in more 3D viewer implementations, not only on desktop computers but, even more importantly, on the web, as we can expect more and more 3D models to be shared in online repositories in the coming years. One way to move forward would be to extend the GigaMesh software framework and create a use case involving 3DHOP, facilitating the loading and processing of annotations in the formats we have discussed.
§7.2. Funding
§7.2.1. This research was funded by the DFG under grant no. 424957759.[21]
§7.3. Availability of data and materials
§7.3.1. The test data we have shown in this publication is available as a separate research data publication and on Zenodo[22] and includes:
- Two 3D models HT 07-31-95 (P530974) and HS 1174 (P134485) in the formats PLY, X3D, and COLLADA, including scanning metadata
- Annotations on renderings of HT 07-31-95 (P530974) and HS 1174 (P134485) in JSON-LD
- 3D annotations calculated from the previously mentioned 2D annotations in PLY, JSON-LD and included within 3D model files (PLY, X3D, COLLADA)
Bibliography
Footnotes
- [1] https://digi.ub.uni-heidelberg.de/wgd/
- [2] https://5stardata.info
- [3] https://gigamesh.eu
- [4] https://www.3dsystems.com/videos/geomagic-wrap-mesh-doctor-analysis
- [5] https://pymeshfix.pyvista.org/index.html
- [6] https://gitlab.com/fcgl/GigaMesh
- [7] http://vcg.isti.cnr.it/nexus/
- [8] https://github.com/i3mainz/3dcap-md-gen
- [9] https://creativecommons.org/choose/zero/
- [10] http://www.fabbers.com/tech/STLFormat
- [11] https://www.programmfabrik.de/en/
- [12] HeiCuBeDa https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/data/IE8CCN
- [13] HeiCu3Da https://heidicon.ub.uni-heidelberg.de/search?ot=objektep=292
- [14] https://recogito.github.io/annotorious/
- [15] cf. https://wiki.hackzine.org/development/python/scale-conversion.html
- [16] https://shapely.readthedocs.io/en/stable/index.html
- [17] https://github.com/linuxlewis/tripy
- [18] https://www.3dhop.net
- [19] https://www.khronos.org/collada/
- [20] https://doi.org/10.5281/zenodo.6287172
- [21] https://gepris.dfg.de/gepris/projekt/424957759
- [22] https://doi.org/10.5281/zenodo.6287172
Version 1.0