
A penetration tester sitting down to assess a client's web application faces an immediate practical problem. The registration form asks for a name, email address, phone number, date of birth, and postal address. Typing "test test" into every field will trip fraud filters before the assessment gets past the first screen. Using real personal data, even the tester's own, creates a liability trail that no compliance officer wants to see in a post-engagement report. The third option is a synthetic identity: a fabricated persona with enough internal consistency to pass validation gates and enough separation from real people to keep the engagement legally clean.

This isn't a niche concern. Security professionals across disciplines, from penetration testing to threat intelligence to incident response, need realistic but fabricated identity data as a routine operational requirement. The quality of that data directly affects whether the engagement succeeds or stalls at the first form validation.

Penetration Testing: Where Validation Logic Meets Fabricated Data

Web application assessments start with account creation. The registration flow is often the first attack surface being tested, and it's also the gateway to everything behind it. A tester who can't create a realistic account can't access the authenticated portions of the application, which is where the interesting vulnerabilities tend to live.

Modern registration forms validate more than format. They check phone number prefixes against country codes. They verify that postal codes match the declared city. Some run the email domain against a blocklist of known temporary email providers. A few cross-reference the billing address against the card's BIN range. Each validation layer is a gate that obviously fake data won't pass.

A synthetic profile that holds up needs every field to be internally consistent: a UK phone number starting with +44, paired with a London address, a matching postal code, and an email on a provider that isn't on any disposable-email blocklist. The name should be plausible for the declared nationality, and the date of birth should produce an age consistent with the rest of the profile. If the form asks for a National Insurance number, it needs to follow the correct format with valid prefix combinations.
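These consistency rules can be checked mechanically before a profile is used. Here's a minimal stdlib-only sketch; the regexes, the sample blocklist, and the field names are simplified illustrations, not complete specifications (the National Insurance check, in particular, only validates the broad shape, not every banned prefix pair):

```python
import re

# Illustrative consistency checks for a UK-locale synthetic profile.
# Patterns below are simplified assumptions, not full specifications.
UK_PHONE = re.compile(r"^\+44\d{10}$")  # +44 followed by 10 digits
UK_POSTCODE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}$", re.I)
# NI number, shape check only: two letters (excluding D,F,I,Q,U,V),
# six digits, suffix A-D. Real rules ban further prefix combinations.
UK_NINO = re.compile(r"^[A-CEGHJ-PR-TW-Z]{2}\d{6}[A-D]$", re.I)
DISPOSABLE_DOMAINS = {"mailinator.com", "guerrillamail.com"}  # sample list

def check_uk_profile(profile: dict) -> list[str]:
    """Return a list of consistency problems; empty means it holds up."""
    problems = []
    if not UK_PHONE.match(profile["phone"]):
        problems.append("phone does not match +44 format")
    if not UK_POSTCODE.match(profile["postcode"]):
        problems.append("postcode is not a valid UK shape")
    if profile["email"].rsplit("@", 1)[-1].lower() in DISPOSABLE_DOMAINS:
        problems.append("email domain is on the disposable blocklist")
    if "nino" in profile and not UK_NINO.match(profile["nino"]):
        problems.append("national insurance number fails format check")
    return problems
```

Running the profile through a check like this before the engagement starts is cheaper than discovering mid-test that the registration form rejects it.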

Testing payment flows adds another layer. The card number needs to pass Luhn validation, the BIN prefix should correspond to the declared country, and the card network (Visa, Mastercard, Amex) determines the expected digit count and CVV length. A 15-digit Amex number with a 4-digit CVV behaves differently in the application than a 16-digit Visa with a 3-digit CVV. Testing both paths requires synthetic cards for both networks.

The alternative, which teams sometimes resort to, is using real prepaid cards bought with cash. This works but introduces its own problems: cost, logistical overhead, traceability through CCTV at point of purchase, and the uncomfortable reality that the card is linked to a real financial instrument. Synthetic numbers that pass client-side validation without connecting to a real account are cleaner in every respect.

Threat Intelligence: Building Disposable Personas for Adversary Engagement

Threat intelligence analysts monitor criminal forums, dark web marketplaces, and social media channels where threat actors operate. Doing this under a real identity is obviously a non-starter. Doing it under a transparently fake identity is nearly as bad, because experienced threat actors vet new forum members and will spot thin personas quickly.

A credible forum persona needs a backstory that's consistent across platforms. The username should have a history. The email address should come from a provider that isn't associated with throwaway accounts. The profile details, if visible, should tell a coherent story. An analyst posing as a mid-level cybercriminal in an Eastern European forum needs a persona that looks like someone who's been around: a plausible name for the region, an email address on a provider popular in that geography, a posting style that matches the community's norms.

The identity data underpinning this persona needs to be generated, not borrowed. Using a real person's information, even someone who doesn't exist in any obvious database, creates ethical and legal risks. If the persona's details happen to match a real individual, any actions taken under that persona could be attributed to that person. Synthetic data eliminates this risk because it's generated from patterns, not derived from real records.

Maintaining multiple concurrent personas adds complexity. An analyst might operate three or four identities across different forums, each with a distinct profile and background. The identities can't overlap. A shared email domain, a similar username pattern, or inconsistent timezone activity across personas can link them together. Generating each persona independently, with its own country, naming conventions, and contact details, creates natural separation.
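One of the linkage risks above, similar username patterns across personas, can be screened for before personas go live. A minimal sketch using Jaccard similarity over character trigrams (the 0.3 threshold is an arbitrary assumption to tune per team, not an established standard):

```python
from itertools import combinations

def trigrams(s: str) -> set[str]:
    """Character trigrams of a lowercased string."""
    s = s.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity of trigram sets; 1.0 means identical sets."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def linkable_pairs(usernames: list[str], threshold: float = 0.3):
    """Flag persona username pairs that look suspiciously alike."""
    return [(a, b) for a, b in combinations(usernames, 2)
            if similarity(a, b) >= threshold]
```

For example, `linkable_pairs(["darkwolf88", "darkwolf_ru", "quietsparrow"])` flags the first two names as linkable while leaving the third alone. It's a coarse filter, but it catches the habit of reusing a favourite handle with minor variations.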

Persona lifecycle management matters too. Disposable personas that get burned after a single engagement are expensive to build if they require manual construction. Having a generator that produces a complete, consistent identity on demand means burned personas are replaced in minutes rather than hours. The analyst's time goes to intelligence gathering instead of identity crafting.

Incident Response: Accessing Compromised Services Under Time Pressure

When a breach is discovered and the incident response team needs to assess the damage, they sometimes need to create accounts on the compromised service to understand the attacker's perspective. What does a new user see? What data is exposed through the registration flow? Can a newly created account access data it shouldn't?

Time pressure during incident response is severe. The team doesn't have the luxury of spending an hour crafting a believable test identity. They need to register an account and start investigating. A pre-generated synthetic profile, stored in the team's runbook, provides a ready-to-use identity that passes registration validation immediately.

The identity also needs to be attributable to the IR team after the fact. Using a real team member's details creates confusion in log analysis ("was this the attacker or the incident responder?"). Using obviously fake details ("John Test, 123 Fake Street") may fail validation entirely. A synthetic identity with a naming convention the team recognises internally (but which looks natural externally) threads that needle. The team knows which accounts are theirs. The log analysis distinguishes responder activity from attacker activity. The registration flow doesn't reject the data.
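One way to build such a convention, sketched here under the assumption that the team shares a per-engagement secret key, is to derive one field of the identity (the surname, say) deterministically from an HMAC. The marker is invisible externally because the surname comes from a natural-looking pool, but any responder holding the key can verify an account as the team's during log analysis. The key, name pool, and field choice are all illustrative:

```python
import hashlib
import hmac

# Hypothetical per-engagement secret, shared within the IR team only.
TEAM_KEY = b"ir-engagement-demo-key"

# Natural-looking surname pool; the HMAC picks one deterministically.
SURNAMES = ["Harding", "Mercer", "Calloway", "Ashford",
            "Renner", "Langley", "Pritchard", "Dovey"]

def surname_for(first_name: str, engagement_id: str) -> str:
    """Deterministically derive a surname for a synthetic IR identity."""
    digest = hmac.new(TEAM_KEY, f"{engagement_id}:{first_name}".encode(),
                      hashlib.sha256).digest()
    return SURNAMES[digest[0] % len(SURNAMES)]

def is_ours(first_name: str, surname: str, engagement_id: str) -> bool:
    """Log-analysis helper: was this account created by the IR team?"""
    return hmac.compare_digest(surname, surname_for(first_name, engagement_id))
```

The derivation is stable, so the same first name and engagement ID always yield the same surname, and an account that wasn't generated this way almost never matches. A collision is possible (an attacker could coincidentally pick a name from the pool), so this is a triage aid, not proof.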

Some incident response scenarios involve creating accounts on services that have been compromised by a third party. If the service is a client's platform, the engagement scope covers this. If the service is a third-party platform where the attacker is operating, the legal analysis becomes more complex. In either case, using synthetic rather than real identity data keeps the IR team's personal information out of systems that may already be under adversary control.

Red Team Operations: Personas That Survive Surface-Level Scrutiny

Red team engagements simulate real-world attacks against the client's organisation. The team operates as an adversary would, using social engineering, technical exploitation, and physical access testing to identify weaknesses. The personas used in these engagements need to withstand casual verification.

A pretext call to the help desk requires a name, an employee ID format (if targeting internal support), or a plausible customer identity (if targeting customer-facing support). The operator needs to know the persona's details well enough to answer follow-up questions without hesitation. Date of birth? Address? Last four of the phone number? Stumbling on these details breaks the pretext. Having a complete synthetic profile that the operator has memorised ensures consistency under pressure.

Physical access testing raises the stakes further. A red team member approaching a building's reception desk with a fabricated visitor identity needs that identity to be coherent if the receptionist decides to verify it. The name on the visitor badge should match the name given verbally. The company the "visitor" claims to represent should be real enough to survive a quick search. The phone number on the visitor log should ring somewhere controlled by the team.

Email-based social engineering (phishing simulations within the engagement scope) requires sender identities that don't immediately look fake. A phishing email from "noreply@totallyfakecompany.test" won't test anything useful. An email from a plausible vendor identity, with a realistic sender name and a domain that's been set up to look legitimate, tests the organisation's actual detection capabilities. The synthetic identity behind the sender needs to be consistent enough that a cautious employee who Googles the sender name doesn't immediately find evidence of fabrication.

Multi-stage social engineering is where persona depth becomes most apparent. An initial phone call establishes rapport. A follow-up email references the conversation. A third interaction requests access or information. The persona needs to be consistent across all three touchpoints. Details mentioned in the first call need to hold up in the email. The backstory can't contradict itself. A generator that produces complete personas with interconnected details provides the foundation. The operator adds the behavioural layer.

Authorisation Boundaries and Record Keeping

Synthetic identities used in security work exist within a legal and contractual framework. The scope of work defines what the testers are authorised to do. The identity data used should be documented as part of the engagement records.

Rules of engagement should specify that synthetic identities are being used, which identities were created, and on which systems they were registered. This documentation protects both the security team and the client. If a synthetic account triggers an alert in the client's security monitoring, the incident responders need to be able to verify quickly that it's an authorised test activity rather than a real attack.

Data handling after the engagement matters too. Synthetic accounts created during testing should be decommissioned. Test data should be removed from the client's systems. The synthetic identities themselves should be retired and not reused across unrelated engagements. Cross-contamination of personas between clients introduces both operational security risks and potential conflicts of interest.

The legal framework varies by jurisdiction, but the general principle is consistent: using synthetic data for authorised security testing is legally cleaner than using real personal data. There's no data subject whose rights are at stake. There's no GDPR concern. There's no risk that a test account's data will be mistaken for a real customer's information and processed accordingly.

Choosing the Right Level of Fidelity

Not every engagement needs a fully detailed synthetic persona. The level of fidelity should match the operational requirement.

A web application penetration test might need only name, email, phone, and address. The persona doesn't need a backstory or a social media presence. It needs to pass form validation and create an authenticated session.

A threat intelligence engagement might need multiple personas with distinct geographic profiles, each with culturally appropriate names, regional email providers, and timezone-consistent activity patterns. The identities need to survive scrutiny from suspicious human reviewers, not just automated validation.

A red team engagement might need a single deep persona with a verifiable-seeming employment history, a phone number that's answered by a team member, and a social media profile that's been aged for weeks before the engagement begins. This level of preparation is expensive, and knowing which engagements require it and which don't is a planning skill.

Generators like Another.IO are useful across this spectrum because they handle the base layer: consistent, format-valid identity data for any country. The security team layers operational depth on top depending on the engagement requirements. A pen tester uses the profile as-is. A red team operator uses it as the foundation for a deeper cover identity. A threat intelligence analyst uses it as the seed for a forum persona.

Over-engineering a persona wastes preparation time. Under-engineering one risks blowing the operation at the first validation gate. Matching fidelity to the task is a skill that improves with experience, and having a generator that produces any level of detail on demand removes the bottleneck from identity creation and puts it back where it belongs: on operational planning.