Protect Your Sensitive Data with Piiranha-v1

Emad Dehnavi
2 min read · Sep 14, 2024

Are you worried about protecting your users’ sensitive data from unauthorized access? Piiranha-v1 can help. It is an open model that detects Personally Identifiable Information (PII) in text, so that it can be flagged or redacted before it is stored or shared.

Piiranha-v1 is released under the permissive MIT License. It is a small encoder model with 280M parameters, fine-tuned to detect PII with high accuracy in six languages: English, Spanish, French, German, Italian, and Dutch.

Top Performance

Piiranha-v1 reports the following results:

  • 98.27% PII token detection rate
  • 99.44% classification accuracy
  • 100% accuracy on emails and near-perfect accuracy on passwords

These results suggest the model detects sensitive information reliably, which makes it a useful tool for any organization that works with PII.

Supported PII Types

Piiranha-v1 detects 17 types of PII across the six supported languages:

  • Account Number
  • Building Number
  • City
  • Credit Card Number
  • Date of Birth
  • Driver’s License
  • Email
  • First Name
  • Last Name
  • ID Card
  • Password
  • Social Security Number
  • Street Address
  • Tax Number
  • Phone Number
  • Username
  • Zipcode
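
To make the list above concrete, here is a minimal sketch of how detected spans could be turned into redacted text. It assumes the model is loaded through the Hugging Face transformers token-classification pipeline, which returns character offsets and a label for each span. The model id, the helper names load_detector and redact, and the label strings in the example are assumptions for illustration, so check the model page for the exact names.

```python
from typing import Dict, List

# Assumption: the model id on the Hugging Face Hub; verify it on the model page.
MODEL_ID = "iiiorg/piiranha-v1-detect-personal-information"


def load_detector():
    """Build a token-classification pipeline (downloads the model on first call)."""
    # Imported lazily so that redact() can be used without transformers installed.
    from transformers import pipeline
    return pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",  # merge sub-word tokens into whole spans
    )


def redact(text: str, entities: List[Dict]) -> str:
    """Replace each detected span with a [LABEL] placeholder.

    Each entity is a dict with 'start', 'end' and 'entity_group' keys, the
    shape produced by the pipeline above. Spans are assumed not to overlap.
    """
    # Work from right to left so earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        text = text[:ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text


# Hypothetical usage (needs transformers, torch and a model download):
# detector = load_detector()
# sample = "Contact Jane at jane@example.com today."
# print(redact(sample, detector(sample)))
```

Replacing spans with their label, rather than deleting them, keeps the redacted text readable and shows which kind of PII was found.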

Context Length and Fine-Tuning

The context length of Piiranha-v1 is 256 tokens, so longer texts should be split into smaller pieces before they are passed to the model. The model is a fine-tuned version of Microsoft’s multilingual mdeberta-v3-base.
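
One simple way to stay inside the context window is to pack whole words into chunks until a token budget is reached. The sketch below is only an illustration: split_into_chunks is a hypothetical helper, and its default token counter just counts whitespace-separated words, which underestimates the real sub-word token count. For accurate sizes you would pass a counter based on the model's own tokenizer and leave a little room for special tokens.

```python
from typing import Callable, List


def split_into_chunks(
    text: str,
    max_tokens: int = 256,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> List[str]:
    """Greedily pack whitespace-separated words into chunks of at most max_tokens.

    count_tokens estimates the token cost of a single word. The default is a
    crude proxy (one token per word); a real tokenizer gives better numbers.
    A single word that costs more than max_tokens still becomes its own chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for word in text.split():
        cost = count_tokens(word)
        # Close the current chunk if this word would exceed the budget.
        if current and current_len + cost > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(word)
        current_len += cost
    if current:
        chunks.append(" ".join(current))
    return chunks


# Hypothetical usage with a real tokenizer (assumes transformers is installed):
# tok = AutoTokenizer.from_pretrained(model_id)
# chunks = split_into_chunks(long_text, 250, lambda w: len(tok.tokenize(w)))
```

Splitting on word boundaries can still cut an address or a full name in two, so splitting on sentences or overlapping the chunks slightly may work better in practice.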

Conclusion

Piiranha-v1 is an open model that detects 17 types of PII in six languages. With 100% accuracy on emails and near-perfect accuracy on passwords, it is a useful tool for finding sensitive data before it is exposed to unauthorized access. It is released under the MIT License, so it can be used and adapted to fit your organization’s needs.

You can try Piiranha-v1 on Hugging Face.

If you liked this post, it would mean the world to me if you clapped and shared it with your friends and network. 🔔 Follow me on: Medium | YouTube | LinkedIn | GitHub
