Docparser offers various filters to extract repeating data sets from documents. This article describes how you can extract single repeating text values (e.g. Names, Street Names) from a document that contains multiple data sets. This method assumes that an anchor keyword is located close to each data value that you want to extract.
A common case for this parsing filter would be a list of data entries, where each individual data field is located after a label. For example, data which looks like this:
Name: John
Street: 123, High Street
Name: Jane
Street: 456, Upper Street
Name: Mark
Street: 789, Lower Street
Alternative methods to parse repeating text data would be our table extraction tool or our filter for parsing repeating text blocks.
In the first step of the parsing rule editor, choose the template "Repeating Text Values".
Click on "Continue" and enter an anchor keyword which is located next to the values you want to extract. The Repeating Text Values filter will then return each line which contains the keyword you entered. In the example above, we could enter "Name:" as the anchor keyword.
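To make the idea of an anchor keyword more concrete, here is a minimal Python sketch of the matching step, applied to the sample data from above. It is only an illustration of the concept and not Docparser's actual implementation; the function name lines_with_keyword is hypothetical.

```python
# Illustrative sketch only: mimics "return each line which contains the keyword".
# The function name is hypothetical and not part of any Docparser API.
SAMPLE = """Name: John
Street: 123, High Street
Name: Jane
Street: 456, Upper Street
Name: Mark
Street: 789, Lower Street"""


def lines_with_keyword(text, keyword):
    """Return every line of `text` that contains `keyword`."""
    return [line for line in text.splitlines() if keyword in line]


print(lines_with_keyword(SAMPLE, "Name:"))
# prints ['Name: John', 'Name: Jane', 'Name: Mark']
```

Entering "Street:" instead would return the three street lines in the same way.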
Finally, you can add a horizontal offset value to move the position where the data extraction starts to the left or to the right. You can also add a row offset value in case the value you want to capture is not located on the same line as the anchor keyword. In addition, you can define after how many characters the data value ends.
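The following Python sketch shows one way these three options could interact. It rests on assumptions that are not confirmed by this article: the horizontal offset is counted in characters from the end of the anchor keyword, the row offset is counted in lines below the keyword line, and the result is trimmed of surrounding whitespace. All names (extract_values, col_offset, row_offset, max_length) are hypothetical and do not correspond to Docparser settings or APIs.

```python
# Illustrative sketch only; offset semantics are assumptions, not Docparser's behavior.
SAMPLE = """Name: John
Street: 123, High Street
Name: Jane
Street: 456, Upper Street
Name: Mark
Street: 789, Lower Street"""


def extract_values(text, keyword, col_offset=0, row_offset=0, max_length=None):
    """Extract one value per line that contains `keyword`.

    col_offset: shifts the start position right (positive) or left (negative),
                measured from the end of the keyword (assumed).
    row_offset: number of lines below the keyword line to read from (assumed).
    max_length: maximum number of characters to capture, or None for no limit.
    """
    lines = text.splitlines()
    values = []
    for i, line in enumerate(lines):
        pos = line.find(keyword)
        if pos == -1:
            continue  # keyword not on this line
        target = i + row_offset
        if not 0 <= target < len(lines):
            continue  # row offset points outside the document
        start = max(0, pos + len(keyword) + col_offset)
        value = lines[target][start:]
        if max_length is not None:
            value = value[:max_length]
        values.append(value.strip())
    return values


print(extract_values(SAMPLE, "Name:"))
# prints ['John', 'Jane', 'Mark']
print(extract_values(SAMPLE, "Name:", col_offset=2, row_offset=1))
# prints ['123, High Street', '456, Upper Street', '789, Lower Street']
print(extract_values(SAMPLE, "Street:", max_length=4))
# prints ['123', '456', '789']
```

In the second call, the keyword "Name:" is used only to locate each data set, while the row offset moves the capture to the street line below it.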