It is proposed to collect ca. 50,000 marks, initials and signatures (collectively known as signoff data) for subsequent pattern recognition and machine learning work. This raises the following questions:
- Should we collect more data?
- Do we require different amounts of data for the three proposed categories (marks, initials and signatures), given the different visual characteristics of these categories?
- What benefits do we anticipate from greater volumes of data, for example from doubling or tripling the volume?
- What are the collection and processing costs of acquiring and working with greater volumes of data?
- Does the quantity of data we require vary according to the specific machine learning techniques we may use?
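
One possible way to approach the question about doubling or tripling the volume is to train on pilot subsets of increasing size, fit a learning curve to the resulting error rates, and extrapolate. The following is a minimal sketch only. It assumes that error falls as a power law in the number of training samples, which would need to be checked for each category and each technique. The function names `fit_power_law` and `predict_error` are hypothetical, and the pilot figures are placeholders chosen for illustration, not measurements.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error = a * n**(-b) by least squares in log-log space; return (a, b)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict_error(a, b, n):
    """Extrapolated error rate at n training samples under the fitted power law."""
    return a * n ** (-b)


# PLACEHOLDER pilot results (not real measurements): error rate observed
# after training on subsets of the given sizes.
pilot_sizes = [1000, 2000, 4000, 8000]
pilot_errors = [0.20, 0.16, 0.128, 0.1024]

a, b = fit_power_law(pilot_sizes, pilot_errors)

# Compare the proposed volume with double and triple that volume.
for n in (50_000, 100_000, 150_000):
    print(f"{n:>7} samples: extrapolated error {predict_error(a, b, n):.4f}")
```

Running the same fit separately for marks, initials and signatures, and for each candidate technique, would indicate whether the categories or techniques need different volumes. The extrapolated gain from extra data can then be weighed against the collection and processing costs.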