I receive encrypted data files into a directory where they are decrypted by a nightly process.
The unencrypted files are then loaded.
I need to write a Java method to return an array containing the filenames of all the unencrypted files in the directory.
The encryption method is OpenSSL (AES-128).
So far I have tried getType(), but it returns content/unknown for both the unencrypted data files and the encrypted files.
I am now looking into reading the first two lines of each file and checking the characters returned to see if the file is encrypted.
What I need to know is, is there a better way of doing this?
I could also live with testing whether the file contents are XML or plain text, rather than testing whether the file is encrypted, if that makes the solution easier.
Use a naming convention so the decrypted files have a different extension, or put the decrypted files in a different directory.
Edit: given the constraints you mention, I think you'll have to do what you suggest in the question. This http://www.dansdata.com/gz125.htm is an interesting guide to the problems of file identification. You could also shell out to the Unix file command if it works with your particular file types.
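For what it's worth, here is a rough Java sketch of the "look at the first bytes" idea from the question. It rests on an assumption the question does not confirm: that the files are produced by the openssl enc command with a password and the default salt. In that case the raw output begins with the ASCII bytes Salted__, or with U2FsdGVkX1 if base64 output was requested. If the files are encrypted with a raw key, with -nosalt, or by some other tool, there is no such marker and this check will not help. The class and method names are invented for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists files that do NOT start with the OpenSSL salt marker.
final class UnencryptedFileLister {

    // Assumption: raw `openssl enc` output with a salted password starts with these bytes.
    private static final byte[] RAW_MAGIC =
            "Salted__".getBytes(StandardCharsets.US_ASCII);
    // Assumption: with base64 output, the same marker shows up in encoded form.
    private static final byte[] BASE64_MAGIC =
            "U2FsdGVkX1".getBytes(StandardCharsets.US_ASCII);

    // True if the first `length` bytes of `data` begin with `prefix`.
    static boolean startsWith(byte[] data, int length, byte[] prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads at most 16 bytes from the file and compares them with both markers.
    static boolean looksEncrypted(Path file) throws IOException {
        byte[] head = new byte[16];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (total < head.length
                    && (n = in.read(head, total, head.length - total)) > 0) {
                total += n;
            }
        }
        return startsWith(head, total, RAW_MAGIC)
                || startsWith(head, total, BASE64_MAGIC);
    }

    // Returns the names of the regular files in `dir` that do not look encrypted.
    static String[] listUnencrypted(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && !looksEncrypted(p)) {
                    names.add(p.getFileName().toString());
                }
            }
        }
        return names.toArray(new String[0]);
    }
}
```

You would call it as UnencryptedFileLister.listUnencrypted(Paths.get("/path/to/incoming")), where the path is a placeholder. Note that a file without the marker is only "not recognisably OpenSSL-encrypted"; the check says nothing about whether its contents are valid.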
I can't comment on AES, but several encryption standards (PGP comes to mind) allow for a common header or will have a common attribute (like a signature block or public key).
Your plan to check for XML would be fine, just feed it through an XML parser. However, that only tells you if the file is XML and not if it is encrypted.
How would you distinguish plain text from an encrypted file? It's all just text, isn't it?
What implementation of AES are you using? Which libraries are you using?
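A minimal sketch of the XML check, using the SAX parser that ships with the JDK, could look like the following. The class and method names are invented for the example, and it only tests well-formedness: it does not validate against a schema, and it treats anything the parser rejects (encrypted bytes, plain text, an empty file) as "not XML".

```java
import java.io.CharConversionException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a file parses as well-formed XML.
final class XmlSniffer {

    static boolean isWellFormedXml(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            // Ask the parser to apply its built-in limits for untrusted input.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            SAXParser parser = factory.newSAXParser();
            // DefaultHandler ignores all content and rethrows fatal parse errors.
            parser.parse(in, new DefaultHandler());
            return true;
        } catch (SAXException | ParserConfigurationException | CharConversionException e) {
            // Not well-formed, or bytes that cannot be decoded as characters.
            return false;
        }
    }
}
```

This reads the whole file, so for very large files it is slower than peeking at the first few bytes. Other I/O failures, such as a missing file, still propagate as IOException.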
Are the encrypted files just encrypted or are they also base64 encoded? How are these files stored on the filesystem? Written directly or through another mechanism?
Based on a comment on the question from @rosco, is there any reason that the file cannot be decrypted to check if it is encrypted? Are they very large files? Is your application the one that decrypts it or are you just a middleman? Are there any security constraints that would prevent you from decrypting it?
Can you put business rules in place? For example, state that the submissions will be rejected if they are not encrypted?