tripdragon Posted March 11, 2004
Hi, I need help cleaning up text EDL files. They are output like this:
0016 BLK V D 060 00:00:00:00 00:00:02:00 01:06:45;26 01:06:47;26
0017 AUX V C 00:00:00:03 00:00:08:25 01:09:35;29 01:09:44;21
0018 AUX V C 00:00:00:00 00:00:10:17 01:09:47;11 01:09:57;28
0019 AUX V C 00:00:00:00 00:00:10:08 01:09:59;24 01:10:10;02
0020 6408 V C 00:26:26;28 00:26:29;17 01:10:41;07 01:10:57;13
PEG A 016 00:00:00 6408
* REPAIR: FROM SOURCE TRUE SPEED IS 4.869000 FPS
0021 6408 V C 00:26:26;28 00:26:27;29 01:11:00;17 01:11:07;01
PEG A 016 00:00:00 6408
* REPAIR: FROM SOURCE TRUE SPEED IS 4.869000 FPS
0022 6408 V C 00:26:28;00 00:26:28;00 01:11:07;01 01:11:07;01
0022 BLK V D 060 00:00:00:00 00:00:02:00 01:11:07;01 01:11:09;01
And I need them to be cleaned up like this:
0008 AUX 00:00:00:00 00:00:05:28 01:04:40;00 01:04:45;28
0009 AUX 00:00:05:28 00:00:05:28 01:04:45;28 01:04:45;28
0009 AUX 00:00:00:22 00:00:06:18 01:04:45;28 01:04:51;24
0010 BLK 00:00:00:00 00:00:00:00 01:05:09;19 01:05:09;19
I have used Tex-Edit Plus to do half of the work to clean it up this far. But these files are HUGE, and it takes too long to do this by hand. Are there any links on text parsing and similar topics? From the clean file, I replace each space with a tab so I can then import it into FileMaker Pro. So I know that it works; I just have to automate the text cleanup part somehow.
Version: v6.x Platform: Mac OS X Panther
CyborgSam Posted March 11, 2004
I wrote an OS X text parsing AppleScript that you can modify to do this. It's attached. There is no documentation, I hacked up something a lot more complex I wrote to make this. The downside is that it's slow. Walk away & eat dinner...
Fenton Posted March 12, 2004
Nice applet. I've just glanced through it. I would suggest to tripdragon that he might want to get hold of either Tex-Edit Plus ($10?, shareware) or TextWrangler ($50). Both of them come with some AppleScript examples. Both are also "recordable." TextWrangler is the little brother of BBEdit: http://www.barebones.com. It is awesomely powerful and fast. I once edited a 14MB text file with no problem (with BBEdit Lite). TextWrangler supports "grep," which, once you learn a little, can do wonders, especially separating text from numbers; basically any kind of pattern matching. You could also probably do this stuff with grep in Unix for free, but there would be more of a learning curve.
tripdragon Posted March 12, 2004
CyborgSam said: I wrote an OS X text parsing AppleScript that you can modify to do this. It's attached. There is no documentation, I hacked up something a lot more complex I wrote to make this. The downside is that it's slow. Walk away & eat dinner...
I tried the applet. It created a new file, but the contents were unchanged from the original. TextWrangler is very nice. It does almost all of what I need. One question though: with the "find all instances" feature that shows me all of the matching lines, how do I get it to select all of those lines at once to delete them? That would complete half of the work right there.
Lee Smith Posted March 12, 2004
Hi tripdragon, your problem touches upon one of my favorite areas. I'll start by saying that I agree with Fenton: the simplest way to go about this is to use one of the products by Bare Bones Software, or one with the abilities that BBEdit and TextWrangler have. Since I'm not familiar with Tex-Edit Plus, I don't know its capabilities. Looking at the site information, it doesn't mention grep at all, and it does mention a file size limit. On the other hand, Bare Bones' products have some very handy tools, including, as Fenton said, the ability to use grep patterns. Grep patterns allow you to find and replace multiple patterns; you can also use them to cut the lines containing those patterns. It's not clear to me from your example what you do with some of the lines, because the "before" text isn't the same as your "cleaned up" text. Perhaps a little more input here could save you a lot of time in your cleanup. HTH, Lee
Version: v6.x Platform: Mac OS 9
tripdragon Posted March 12, 2004
Sure! I almost figured out the method to use with BBEdit or TextWrangler. I found that I could select text vertically! WOO HOO! That will help so much for other stuff too. The messy text is like so:
* REPAIR: TO SOURCE TRUE SPEED IS 4.869000 FPS
0004 BLK V C 00:00:00:00 00:00:00:00 01:02:46;02 01:02:46;02
0004 AUX V D 020 00:00:18:03 00:00:23:03 01:02:46;02 01:02:51;02
0005 AUX V C 00:00:23:03 00:00:23:03 01:02:51;02 01:02:51;02
0005 BLK V D 020 00:00:00:00 00:00:00:20 01:02:51;02 01:02:51;22
0006 BLK V C 00:00:00:00 00:00:00:00 01:03:01;25 01:03:01;25
0006 AUX V D 020 00:00:18:03 00:00:22:27 01:03:01;25 01:03:06;19
0007 BLK V C 00:00:00:00 00:00:00:00 01:04:15;17 01:04:15;17
0007 AUX V D 020 00:00:18:03 00:00:22:27 01:04:15;17 01:04:20;11
0008 BLK V C 00:00:00:00 00:00:00:00 01:05:09;19 01:05:09;19
0008 AUX V D 020 00:00:18:03 00:00:22:27 01:05:09;19 01:05:14;13
0009 6510 V C 01:26:19;15 01:26:27;18 01:10:41;07 01:10:57;13
PEG A 050 00:00:00 6510
0010 6510 V C 01:26:19;15 01:26:22;22 01:11:00;17 01:11:07;01
PEG A 050 00:00:00 6510
0011 6510 V C 01:26:22;22 01:26:22;22 01:11:07;01 01:11:07;01
0011 BLK V D 060 00:00:00:00 00:00:02:00 01:11:07;01 01:11:09;01
Studying that text, I have found that the "V", "C" and "D" in the center are always in the same columns, and so are the three-digit numbers, like this:
V D 060
V D 060
V D 060
V D 060
V D 060
V D 060
They always sit at character positions 13-27 of the line. So if I can remove characters 13-27 from those lines, that will get half of it done. Then I need to remove all lines that begin with strings I choose as not useful, like "* REPAIR:". So if I could wildcard all lines that begin with something like * REPAIR: and PEG A, that would do the complete cleanup! Yahoo! No more copy and paste till I'm blue in the face.
Version: v6.x Platform: Mac OS X Panther
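The two steps described in this post (drop the unwanted lines, then cut a fixed range of character positions) could be sketched in Python roughly as follows. This is only an illustration, not something from the thread: the function name `clean_edl_lines` and its parameters are hypothetical, and it assumes that positions 13-27 are counted from 1 and are inclusive, and that the real file keeps its original column padding.

```python
def clean_edl_lines(lines, drop_prefixes=("* REPAIR:", "PEG A"),
                    first_col=13, last_col=27):
    """Drop lines starting with an unwanted prefix, then cut a column range.

    first_col and last_col are 1-based and inclusive (an assumption about
    how the positions in the post were counted).
    """
    cleaned = []
    for line in lines:
        text = line.rstrip("\r\n")
        # str.startswith accepts a tuple, so one call checks every prefix.
        if text.lstrip().startswith(drop_prefixes):
            continue
        # Keep everything before first_col and everything after last_col.
        cleaned.append(text[:first_col - 1] + text[last_col:])
    return cleaned
```

Typical use would be `clean_edl_lines(open("infile"))`, writing the returned lines to a new file. A fixed column cut only works while every event line is padded the same way; a pattern-based approach is more forgiving if the padding varies.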
Lee Smith Posted March 12, 2004
You write a grep pattern using the pipe character to separate the alternatives. In other words, you can cut the lines containing REPAIR: and PEG A in one swoop. From the tools, select "Process Lines Containing", then copy this into the box:
PEG|REPAIR:
Then select these options: Use Grep, Case Sensitive, Delete Matched Lines.
I own BBEdit 5.1, and it doesn't have the vertical text selection ability. How do you do this? Lee
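For readers without BBEdit, the same idea (one alternation pattern, delete every line that matches) can be sketched in Python. This is a rough equivalent under stated assumptions, not a description of how BBEdit works internally; the names `UNWANTED` and `delete_matching_lines` are made up for the example.

```python
import re

# Same alternation as in the post; Python regexes are case sensitive
# unless re.IGNORECASE is passed.
UNWANTED = re.compile(r"PEG|REPAIR:")


def delete_matching_lines(text, pattern=UNWANTED):
    """Return text with every line that contains a match removed."""
    kept = [line for line in text.splitlines() if not pattern.search(line)]
    return "\n".join(kept)
```

Note that the pattern matches anywhere in a line, so a reel name containing "PEG" would also be deleted. Anchoring it, for example `^(PEG|\* REPAIR:)`, would restrict the match to the start of the line.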
tripdragon Posted March 12, 2004
Also, somebody has been helping me with this bit of code. I have tried it in Terminal, but no go:
sed -e '/^[0-9]\{4,\}/!d;s/\([^ ]* [^ ]*\)[A-Z0-9 ]* \(.*\)/\1 \2/' infile > outfile
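As a way to read what the sed one-liner is trying to do, here is a line-by-line Python rendering. It rests on two assumptions that the thread does not confirm: that the posted command lost its backslashes when it was pasted into the forum, and that fields are separated by single spaces. The names `EVENT_LINE`, `STRIP_MIDDLE` and `sed_like_cleanup` are invented for the sketch.

```python
import re

# First sed command: delete every line that does NOT start with 4+ digits.
EVENT_LINE = re.compile(r"^[0-9]{4,}")
# Second sed command: keep the first two fields, skip a run of capitals,
# digits and spaces, then keep the rest of the line.
STRIP_MIDDLE = re.compile(r"([^ ]* [^ ]*)[A-Z0-9 ]* (.*)")


def sed_like_cleanup(lines):
    """Apply both steps of the sed command to an iterable of lines."""
    out = []
    for line in lines:
        text = line.rstrip("\r\n")
        if not EVENT_LINE.match(text):
            continue
        # count=1 mirrors sed's s/// without the g flag.
        out.append(STRIP_MIDDLE.sub(r"\1 \2", text, count=1))
    return out
```

The middle character class stops at the first colon, which is why the "V D 060" part is removed while the timecodes survive.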
Lee Smith Posted March 12, 2004
I'm guessing that the "terminal" is Unix or something. I can't get it to work either. Is this Perl or something? It does contain some regular expressions, but it appears to be a script that will process your whole file. Did you ask the other person helping you about it not working? Lee
tripdragon Posted March 12, 2004
Sweet! Happy! That did it! A few things now. To do the vertical select, hold down Option. How can I automate this as an AppleScript or something close to it? And what command do I use to turn all spaces into tabs, and all tabs into spaces?
Lee Smith Posted March 12, 2004
If there is a run of more than one space (2 or 3) between the data fields, you can use Entab under the Text menu. But for a one-for-one conversion, you have to search for a single space (just type it from the keyboard) and replace it with \t (\t = tab in BBEdit).
Fenton Posted March 12, 2004
When you want to automate conversion of a large text file, you don't want to be selecting columns with the mouse (though this is a cool feature, very useful for fixing up irregular spreadsheet files or tables copied from the internet). Here's what you want to do:
Find lines beginning with any text (not numbers):
^[^\d].*
Replace with nothing.
Find space, V C or V D followed by a space, 3 numbers and a space (watch the spaces, grep):
V [CD] \d\d\d
Replace with \t
Find spaces (single or multiple, grep):
+
Replace with \t
[OK, the above is missing leading spaces in places, 'cause HTML strips them. Use the AppleScript below.]
When I said BBEdit (or TextWrangler) is "recordable," I meant that you can manually turn on AppleScript recording, under the Scripts icon, then do the preceding search/replaces, and it will record them as an AppleScript script, then ask you to save it. You can then put the resulting file into BBEdit's Scripts folder and it will be available. You can even assign it a command key. This is what it looks like.
It adds all the default parameters, which you can generally ignore, or even remove, but there's no need:
tell application "BBEdit"
    activate
    replace "^[^\\d].*" using "" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, backwards:false, case sensitive:false, match words:true, extend selection:false}
    replace " V [CD] \\d\\d\\d " using "\\t" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, backwards:false, case sensitive:true, match words:true, extend selection:false}
    replace " +" using "\\t" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, backwards:false, case sensitive:false, match words:true, extend selection:false}
    replace "\\r\\r+" using "\\r" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, backwards:false, case sensitive:false, match words:true, extend selection:false}
end tell
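For comparison outside BBEdit, the same sequence of find/replace passes can be sketched as a small Python function that produces tab-delimited rows ready for import. This is an illustration rather than a translation of the recorded script: the name `edl_to_tab_delimited` is hypothetical, and, based on the sample lines earlier in the thread, it assumes the three-digit number after "V C" or "V D" may be absent and that fields may be separated by more than one space.

```python
import re

# " V C " or " V D ", optionally followed by a three-digit number.
TRACK_FIELDS = re.compile(r"\s+V\s+[CD]\s+(?:\d{3}\s+)?")
SPACE_RUN = re.compile(r" +")


def edl_to_tab_delimited(text):
    """Keep only event lines and return them as tab-delimited rows."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Pass 1: drop blank lines and lines that do not start with a digit.
        if not line or not line[0].isdigit():
            continue
        # Pass 2: replace the track/transition fields with a single tab.
        line = TRACK_FIELDS.sub("\t", line, count=1)
        # Pass 3: turn every remaining run of spaces into a tab.
        rows.append(SPACE_RUN.sub("\t", line))
    return "\n".join(rows)
```

Skipping blank lines in the first pass plays the role of the final pass in the recorded script, which collapses repeated line endings.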
CyborgSam Posted March 13, 2004
I should have been more clear that the applet does NOT do any processing as I posted it. You have to enter the AppleScript code to do this. Sounds like you're finding BBEdit the easiest way to go.
Version: v7.x Platform: Mac OS X Panther
tripdragon Posted March 15, 2004
Fenton said:
tell application "BBEdit"
    activate
    replace "^[^\\d].*" using "" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, backwards:false, case sensitive:false, match words:true, extend selection:false}
    replace " V [CD] \\d\\d\\d " using "\\t" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, backwards:false, case sensitive:true, match words:true, extend selection:false}
    replace " +" using "\\t" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, backwards:false, case sensitive:false, match words:true, extend selection:false}
    replace "\\r\\r+" using "\\r" searching in text 1 of text window 1 options {search mode:grep, starting at top:true, wrap around:false, backwards:false, case sensitive:false, match words:true, extend selection:false}
end tell
This stuff is great! I am very busy right now. I have more questions and will post them here soon. Thank you.