Batch Process Drawings

Imagine you need to process hundreds of engineering drawings from an old archive. Rather than submitting them one by one, you would like to process them all in a single run. Werk24 lets you do exactly that with a short script; all you need to do is follow the steps in this tutorial.

What will I learn?

How to process multiple technical drawings in one run.
How to avoid losing data when files share a name. For example, file_name.pdf and file_name.png could be two different drawings, yet both would map to the same output name. This tutorial shows how to keep the results of both.

Step 1: Import all the requirements

Let's get imports out of the way at the very beginning.

import asyncio 
import glob 
import json 
import os 
from functools import partial 
from pathlib import Path 

from werk24 import Hook, W24AskTitleBlock 
from werk24.models.techread import W24TechreadMessage 
from werk24.utils import w24_read_async

Step 2: Define folder paths

There are many ways of defining the input and output paths. For simplicity, we define them as constants. Please update these paths to match your own setup.

INPUT_FOLDER = Path("/abs/path/to/input/") 
OUTOUT_FOLDER = Path("/abs/path/to/output")
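The code in the later steps opens files inside the output folder for writing, which fails if that folder does not exist. A minimal sketch of creating it on demand follows. `ensure_folder` is a hypothetical helper introduced here for illustration, not part of Werk24, and the demonstration uses a temporary directory instead of your real paths.

```python
import tempfile
from pathlib import Path


def ensure_folder(folder: Path) -> Path:
    """Create `folder` (and any missing parents) if needed and return it."""
    # exist_ok=True makes the call safe to repeat on every run
    folder.mkdir(parents=True, exist_ok=True)
    return folder


# Demonstration in a temporary directory, so nothing on your disk is touched.
with tempfile.TemporaryDirectory() as tmp:
    output_folder = ensure_folder(Path(tmp) / "werk24" / "output")
    ensure_folder(output_folder)  # calling it a second time does not raise
    folder_was_created = output_folder.is_dir()
```

In the tutorial script you could call such a helper once with the output folder constant before processing starts.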

Step 3: Obtain and process all files in the input folder

Our first functional code snippet takes a path as input, locates all files in the given folder and processes them.

async def _process_folder(path: Path) -> None:
    """ Process all files in the provided path

    Args:
        path (Path): Absolute or relative path of the folder
            to process
    """
    # glob.glob expects a string pattern, so convert the Path first
    files = [
        f
        for f in glob.glob(str(path / "*"))
        if os.path.isfile(f)
    ]

    # process the files one after another
    for filename in files:
        await _process_file(filename)
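The loop above awaits each file before starting the next one, so the drawings are handled sequentially. If you would rather have several files in flight at once, one common asyncio pattern is `asyncio.gather` combined with a semaphore that caps the number of simultaneous tasks. The sketch below is an assumption-laden illustration, not the Werk24 way of doing it: `process_concurrently` and `_fake_process_file` are hypothetical names, the fake worker stands in for `_process_file`, and the limit of 2 is an arbitrary placeholder.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def process_concurrently(
    filenames: Iterable[str],
    worker: Callable[[str], Awaitable[None]],
    max_concurrency: int = 5,
) -> None:
    """Run `worker` on every filename, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def _guarded(filename: str) -> None:
        # only `max_concurrency` workers can be inside this block at once
        async with semaphore:
            await worker(filename)

    await asyncio.gather(*(_guarded(f) for f in filenames))


# Stand-in for `_process_file`; it only records which files it saw.
processed: List[str] = []


async def _fake_process_file(filename: str) -> None:
    await asyncio.sleep(0)  # yield control, as a real network call would
    processed.append(filename)


asyncio.run(
    process_concurrently(["a.pdf", "b.png", "c.tif"], _fake_process_file, 2)
)
```

Whether concurrent requests are appropriate for your account is something to verify before adopting this pattern.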

Step 4: Process individual file

Next, we need to define what "process this file" means. This function loads the drawing bytes from the given path, defines which information shall be extracted and which function shall be called when that information is available, and then starts the process.

async def _process_file(path: str) -> None: 
    """ Process the file located at `path`
    Args: 
        path (str): Path of the file 
    """ 
    with open(path, 'rb') as fid: 
        drawing_bytes = fid.read() 

    hooks = [ 
        Hook( 
            ask=W24AskTitleBlock(), 
            function=partial(_store_response, input_path=path) 
        ) 
    ] 
    await w24_read_async(drawing_bytes, hooks)
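The hook above uses `functools.partial` to attach the input path to the callback, so that the callback knows which drawing a response belongs to even though it is later called with the message only. The sketch below shows that mechanism in isolation; `store` and the string message are stand-ins made up for this example and do not involve the Werk24 client.

```python
from functools import partial
from typing import List, Tuple

received: List[Tuple[str, str]] = []


def store(message: str, input_path: str) -> None:
    """Stand-in callback: record the message together with its source file."""
    received.append((input_path, message))


# Pre-bind `input_path`; the resulting callable only needs `message`.
callback = partial(store, input_path="drawing_a.pdf")

# The caller passes the message only; the bound path travels along with it.
callback("title block result")
```

Because a new partial is created for each file, every callback carries its own path.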

Step 5: Store the response in a file

The previous step registers a function, _store_response, that is called whenever the API returns information. This function can be synchronous or asynchronous; the Werk24 framework will call it accordingly. As this function is IO-heavy, we are opting for an asynchronous function.

async def _store_response(
    message: W24TechreadMessage,
    input_path: str
) -> None:
    """ Store the API response in a file

    Args:
        message (W24TechreadMessage): Message that we received
            from Werk24

        input_path (str): Path of the original file that
            was processed. This will automatically be converted
            to an output path
    """
    output_filename = _make_output_path(input_path)
    with open(output_filename, "w+") as file:
        # json.dump writes the serialized payload directly to the file
        json.dump(message.payload_dict, file, indent=6)
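To make the "synchronous or asynchronous" point concrete, here is one way a caller could support both kinds of callback: call the function, then await the result only if it is awaitable. This is a generic sketch of the idea and not a description of how the Werk24 client is implemented; `call_hook` and both callbacks are hypothetical names.

```python
import asyncio
import inspect
from typing import Any, Callable, List

calls: List[str] = []


def sync_callback(message: str) -> None:
    calls.append(f"sync:{message}")


async def async_callback(message: str) -> None:
    await asyncio.sleep(0)  # pretend to do some asynchronous work
    calls.append(f"async:{message}")


async def call_hook(function: Callable[[str], Any], message: str) -> None:
    """Call `function` and await the result only if it is awaitable."""
    result = function(message)
    if inspect.isawaitable(result):
        await result


async def main() -> None:
    await call_hook(sync_callback, "m1")
    await call_hook(async_callback, "m2")


asyncio.run(main())
```

Checking the returned value rather than the function itself also covers callbacks wrapped in `functools.partial`.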

Step 6: Convert the input filename into output path

The previous step converted the input_path into an output_filename by calling _make_output_path. We still need to define this function. To ensure that previously written results are not overwritten, it checks whether the target file name already exists and, if so, looks for a name that is still free.

def _make_output_path( 
    input_path: Path, 
    max_duplicates: int = 100 
) -> Path: 
    """  Convert an input path into an output path

    Args:
        input_path (Path): path of the original
            file

        max_duplicates (int): maximum number of numbered
            alternatives to try when the name is taken

    Returns: 
        Path: output path 

    Raises:
        RuntimeError: raised when we could not generate an
            output path
    """ 
    basepath = os.path.basename(input_path) 
    parts = os.path.splitext(basepath) 

    # return the simple path if it does not exist yet 
    path = OUTOUT_FOLDER / f"{parts[0]}.json" 
    if not os.path.exists(path): 
        return path 

    # otherwise try several times to find a path that 
    # does not exist yet 
    for i in range(max_duplicates): 
        path = OUTOUT_FOLDER / f"{parts[0]}({i+1}).json" 
        if not os.path.exists(path): 
            return path 

    raise RuntimeError( 
        f"Could not generate a new output filename for `{input_path}`")
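The naming scheme can be tried out without calling the API. The sketch below reimplements the same idea with `pathlib` against a temporary folder, so that it runs on its own; `unique_json_path` is a hypothetical stand-in for `_make_output_path` that takes the target folder as an argument instead of reading a constant.

```python
import tempfile
from pathlib import Path


def unique_json_path(
    folder: Path, input_path: str, max_duplicates: int = 100
) -> Path:
    """Return `<stem>.json` in `folder`, or `<stem>(n).json` if that is taken."""
    stem = Path(input_path).stem
    candidates = [f"{stem}.json"] + [
        f"{stem}({i}).json" for i in range(1, max_duplicates + 1)
    ]
    for name in candidates:
        path = folder / name
        if not path.exists():
            return path
    raise RuntimeError(
        f"Could not generate a new output filename for `{input_path}`")


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    names = []
    for source in ["file_name.pdf", "file_name.png", "file_name.tif"]:
        target = unique_json_path(folder, source)
        target.write_text("{}")  # occupy the name, as `_store_response` would
        names.append(target.name)
```

The three inputs share a stem, so they end up as file_name.json, file_name(1).json and file_name(2).json, and no result overwrites another.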

Step 7: Call the main function

Done. All that remains is to call the main function.

if __name__ == "__main__": 
    asyncio.run(_process_folder(INPUT_FOLDER))