Container field: How to insert all files from a folder.


This topic is 378 days old. Please don't post here. Open a new topic instead.

I have 63,000+ email files (file extension: .eml) in a folder on my Mac (filemac:/Macintosh HD/Users/onemac/Documents/emails). I want to import each of these .eml files into a container field (field name: Email_Container). What is the best way to handle this? I want a separate record for each email, and for the container field contents I want to "Store only a reference to the file." It appears that I need to use Insert File (not Import Records) because I'll later need GetContainerAttribute to pull out the file name for parsing. When I use Import Records for these .eml files, GetContainerAttribute does not work; with Insert File, it does. However, the Insert File script step does not seem to support a path to a folder (only to a file). From what I understand, I'll need to use the Get Folder Path script step along with the Get(DocumentsPath) function, but I must be doing something wrong. This should be really easy, but I'm not getting my mind around this one. Help with writing this would certainly be appreciated. 🙂


I think the easiest way would be to open the folder in Finder, select all files, copy and paste into a text file. Then import the text file (as .tab or .csv) and use a calculation field to return the full path to each file. If you set the result type to Container, the effect will be the same as inserting the file as reference only. Or you could go an extra step and use Replace Field Contents to populate an actual container field with the calculated path.
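For anyone who would rather script the file list than copy it out of Finder, here is a minimal sketch in Python (run outside FileMaker). It writes one .eml file name per line to a text file that can then be imported. The function names and the output file name are hypothetical, and the POSIX folder path assumes that "Macintosh HD" in the question is the startup volume.

```python
from pathlib import Path


def list_eml_names(folder):
    """Return the names of all .eml files directly inside folder, sorted."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".eml"
    )


def write_name_list(folder, out_file):
    """Write one file name per line to out_file; return how many were written."""
    names = list_eml_names(folder)
    Path(out_file).write_text(
        "".join(name + "\n" for name in names), encoding="utf-8"
    )
    return len(names)


if __name__ == "__main__":
    # Assumption: the folder from the question; the output file name is made up.
    folder = Path("/Users/onemac/Documents/emails")
    if folder.is_dir():
        count = write_name_list(folder, Path.home() / "eml_names.txt")
        print(count, "names written")
```

After importing that list, the path calculation described above would presumably just prepend the folder to each name, along the lines of "filemac:/Macintosh HD/Users/onemac/Documents/emails/" & Filename, assuming the imported field is called Filename.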

 


1 hour ago, comment said:

I think the easiest way would be to open the folder in Finder, select all files, copy and paste into a text file. Then import the text file (as .tab or .csv) and use a calculation field to return the full path to each file. If you set the result type to Container, the effect will be the same as inserting the file as reference only. Or you could go an extra step and use Replace Field Contents to populate an actual container field with the calculated path.

 

Comment,

You're great. Both of your options worked like a charm. 🙂

As far as parsing these .eml file names into three separate calculation fields, do you have a streamlined solution?

Example file name #1 ...

New EOB Posting - "EOB Notice" <[email protected]> - 2010-01-10 0944.eml

cSubject: New EOB Posting 

cEmail: [email protected]

cDate: 2010-01-10 (YYYY-MM-DD format - Calculation result of TEXT (vs DATE) is fine)

Example file name #2 ...

BMI Work Registration Report - <[email protected]> - 2012-06-21 1534.eml

cSubject: BMI Work Registration Report 

cEmail: [email protected]

cDate: 2012-06-21 (YYYY-MM-DD format - Calculation result of TEXT (vs DATE) is fine)

 


I don't know if it's safe to generalize from only 2 examples, but try:

cSubject =

Left ( Filename ; Position ( Filename ; " - " ; 1 ; 1 ) - 1 )

cEmail =

Let ( [
  start = Position ( Filename ; "<" ; 1 ; 1 ) + 1 ;
  end = Position ( Filename ; ">" ; start ; 1 )
] ;
  Middle ( Filename ; start ; end - start )
)

cDate =

Middle ( Filename ; Position ( Filename ; " - " ; 1 ; 2 ) + 3 ; 10 )
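To check the three calculations against sample names outside FileMaker, the same logic can be mirrored in Python. This is only a sketch under the same assumption the formulas make: exactly three components separated by " - ", with the address in angle brackets. The helper names are hypothetical, and the address in the usage lines is a placeholder because the real ones are obscured in this thread.

```python
def nth_index(text, needle, n):
    """0-based index of the n-th occurrence of needle in text, or -1."""
    idx = -1
    for _ in range(n):
        idx = text.find(needle, idx + 1)
        if idx == -1:
            return -1
    return idx


def first_pass(filename):
    """Mirror of the cSubject, cEmail and cDate calculations above."""
    sep = " - "
    first = nth_index(filename, sep, 1)
    second = nth_index(filename, sep, 2)

    # cSubject: everything before the first separator.
    subject = filename[:first] if first != -1 else ""

    # cEmail: the text between the first "<" and the next ">".
    lt = filename.find("<")
    gt = filename.find(">", lt + 1) if lt != -1 else -1
    email = filename[lt + 1:gt] if gt != -1 else ""

    # cDate: the 10 characters after the second separator.
    start = second + len(sep)
    date = filename[start:start + 10] if second != -1 else ""

    return {"cSubject": subject, "cEmail": email, "cDate": date}


if __name__ == "__main__":
    # Placeholder address; the file name shape follows example #1.
    name = 'New EOB Posting - "EOB Notice" <notice@example.com> - 2010-01-10 0944.eml'
    print(first_pass(name))
```

Any name with an extra " - " inside the subject or sender would shift the second separator, so cDate would pick up the wrong 10 characters.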

 


Hey comment,

 

Thanks for your attention to detail. 🙂

 

I have imported 66,884 emails successfully. 

 

You were TOTALLY correct when you said: 

"I don't know if it's safe to generalize from only 2 examples, but try:"

 

cDate parsing is failing with file names like these …

* Statements and Payment Reports Request - HarryFox - NoReply <[email protected]> - 2020-01-22 0941.eml

* It takes less than 10 seconds to rate your latest SugarSync Suppor...SugarSync <[email protected]> - 2019-08-08 1607.eml

 

cEmail parsing is failing with this type of email address …

* Form submission from peckmusicgroup.net - [email protected] - 2007-06-21 1645-2.eml

 

cDate parsing and cEmail parsing are failing on this type of email address …

* My Email Address Has Changed Re_ Over any earthly rule...- [email protected] - 2016-11-08 1547.eml

 

Keep in mind (this may be helpful), there are certain file names where the @ symbol appears more than once …

* An Seong Bok has just paid for your invoice [email protected]" <[email protected]> - 2012-09-20 1835.eml

 

Thoughts?

 

Thanks


I think we need to understand the rules that were used to construct these strings, before we can formulate the rules to take them apart.

Looking at your added examples, I must admit I don't see the logic. I thought there were 3 main components separated by " - " but now you show an example with 3 such separators and others with only one. Possibly the date could be extracted by looking for the last separator instead of the second one, but that still leaves the other two components in a rather hazy state. 

I can help you with reversing the logic, but I have no advantage in deducing the original one.

 


This is a challenging situation for sure. 🤔

The accuracy of the cDate is the most important element.

The accuracy of the cEmail is something I'd like to have fairly close ... I could always manually adjust some. I believe there are very few .eml file names with more than one @ symbol, so maybe that's something that helps.

The cSubject is not that important.

Thanks 🙂


I don't see how that's moving us forward. As I said, the date could possibly be extracted more reliably using:

Middle ( Filename ; Position ( Filename ; " - " ; Length ( Filename ) ; -1 ) + 3 ; 10 )

As for the email, if we could assume that the email you want contains the last @ character in the entire string AND that the email is surrounded either by spaces or by angle brackets, then we could do something with that. But I am purely guessing at this point.
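The "last separator" idea behind the date expression can be tried on sample names outside FileMaker. The Python sketch below takes the 10 characters after the last " - ". The is_iso_date check is an addition of mine, not part of the calculation; it is only meant to flag names where those 10 characters are not a real YYYY-MM-DD date. Function names and the placeholder addresses are hypothetical.

```python
from datetime import datetime


def parse_date(filename, sep=" - "):
    """Return the 10 characters that follow the last separator, or ""."""
    idx = filename.rfind(sep)
    if idx == -1:
        return ""
    start = idx + len(sep)
    return filename[start:start + 10]


def is_iso_date(text):
    """True if text is a real calendar date written as YYYY-MM-DD."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    # strptime also accepts unpadded values such as 2010-1-5, so check length.
    return len(text) == 10


if __name__ == "__main__":
    names = [
        "Statements and Payment Reports Request - HarryFox - NoReply "
        "<noreply@example.com> - 2020-01-22 0941.eml",
        "Form submission from peckmusicgroup.net - info@example.com "
        "- 2007-06-21 1645-2.eml",
    ]
    for name in names:
        date = parse_date(name)
        print(date, is_iso_date(date))
```

A trailing suffix like "1645-2" does not interfere, because the hyphen there has no spaces around it.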

 


It looks like your latest cDate is EXTREMELY accurate. 🙂 🙂

Your latest thoughts concerning the cEmail should nail it ...

The email address we want to parse contains the last @ character in the entire string AND that email address is surrounded either by spaces or by angle brackets.

Thanks!

 


Ok, so try:

Let ( [
  mask = Substitute ( Filename ; [ "<" ; " " ] ; [ ">" ; " " ] ) ;
  at = Position ( mask ; "@" ; 1 ; PatternCount ( Filename ; "@" ) ) ;
  start = Position ( mask ; " " ; at ; -1 ) ;
  end = Position ( mask ; " " ; at ; 1 )
] ;
  Trim ( Middle ( mask ; start ; end - start ) )
)
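For checking the rule on sample names outside FileMaker, here is a rough Python equivalent of the calculation above: turn the angle brackets into spaces, find the last "@", then take the text between the nearest space on each side. It is a sketch under the agreed assumption (the wanted address holds the last @ and is bounded by spaces or angle brackets). The function name and the addresses are placeholders, since the real addresses are obscured in this thread.

```python
def parse_email(filename):
    """Return the space-delimited token around the last "@", or ""."""
    # Treat angle brackets as spaces so both address styles look alike.
    mask = filename.replace("<", " ").replace(">", " ")

    at = mask.rfind("@")
    if at == -1:
        return ""

    # Nearest space before the "@" (rfind gives -1 if there is none,
    # which makes the slice below start at index 0).
    start = mask.rfind(" ", 0, at)

    # Nearest space after the "@"; fall back to the end of the string.
    end = mask.find(" ", at)
    if end == -1:
        end = len(mask)

    return mask[start + 1:end].strip()


if __name__ == "__main__":
    names = [
        "Form submission from peckmusicgroup.net - info@example.com "
        "- 2007-06-21 1645-2.eml",
        'An Seong Bok has just paid for your invoice payer@example.org" '
        "<billing@example.com> - 2012-09-20 1835.eml",
    ]
    for name in names:
        print(parse_email(name))
```

In a name with two addresses, only the one containing the last "@" is returned, which matches the rule agreed on above.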

 
