This is a discussion on "Parsing text" within the PHP Forum section. This forum and the thread "Parsing text" are both part of the Program Your Website category.
Parsing text
Parsing text
I'm just learning php and need to parse some large text files. I was going to use Perl for this, but I read somewhere that "php is the new Perl"... I'm guessing that's an overstatement, but it got me to thinking that I may as well do this (if possible) in php so I can learn from it.
The text file is from a database that I need to replicate. I don't have admin access to it, so I'm using print-to-text-file output of a report. The text file looks something like this...
1. Find the (first instance of the first) identifier
2. Write it (and the associated field value) to the first two elements of a multidimensional array
3. Find the next line containing the identifier
4. Split the line into the identifier and the field value
5. Write the field value to an array element
6. When the identifier is not found, restart the loop from the next identifier
7. Use an insert query to add the values from the array to the new database.

Gotchas: Some of the fields (e.g. Notes) will be multiple line fields. These may be interrupted by the aforementioned garbage lines. (Seems like the way to get around this is to get rid of the garbage lines first.) Also, some fields may contain null values.

Questions:
1. Is php well-suited (or well enough) for doing line-by-line text parsing?
2. What functions should I be looking into in order to accomplish this?
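A rough sketch of steps 1 to 6 from the post above, for anyone following along. The actual report layout is not shown in the thread, so the `Identifier: value` line format, the field names (`ID`, `Name`, `Notes`) and the function name `parseReport` are all assumptions made up for illustration. Lines without a known identifier are treated as continuations of the previous field, and empty values are stored as null.

```php
<?php
// Hypothetical sample: the real report layout is not shown in the thread.
$report = <<<TXT
ID: 1001
Name: Alice Example
Notes: First line of notes
second line of notes
ID: 1002
Name:
Notes: Short note
TXT;

// Assumed field identifiers; the first one marks the start of a new record.
$identifiers = ['ID', 'Name', 'Notes'];

function parseReport(string $text, array $identifiers): array
{
    $records = [];
    $current = null;   // record currently being built
    $lastField = null; // last identifier seen, used for continuation lines

    foreach (preg_split('/\R/', $text) as $line) {
        // Split "Identifier: value" into at most two parts.
        $parts = explode(':', $line, 2);
        $key = trim($parts[0]);

        if (count($parts) === 2 && in_array($key, $identifiers, true)) {
            if ($key === $identifiers[0]) {
                // First identifier seen again: store the finished record, start a new one.
                if ($current !== null) {
                    $records[] = $current;
                }
                $current = array_fill_keys($identifiers, null);
            }
            if ($current === null) {
                continue; // field line before the first record starts
            }
            $value = trim($parts[1]);
            $current[$key] = ($value === '') ? null : $value;
            $lastField = $key;
        } elseif ($current !== null && $lastField !== null && trim($line) !== '') {
            // No known identifier: append the line to the previous field.
            $current[$lastField] = ltrim(($current[$lastField] ?? '') . "\n" . trim($line));
        }
    }

    if ($current !== null) {
        $records[] = $current;
    }
    return $records;
}

$records = parseReport($report, $identifiers);
print_r($records);
```

For step 7, each record could then be bound to a prepared INSERT statement (PDO is one option); that part is left out here because the target schema is not described in the thread.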
Re: Parsing text
Both Perl and PHP, and others for that matter, are suited to your purpose.
You need to be looking at what are known as Regular Expressions. Make a large pot of coffee because you are going to have to do some serious reading and understanding.
Re: Parsing text
Thanks for the reply, ukgeoff!
I'm familiar with regular expressions and have used them before. The part that I find most frustrating is that they always seem to want to select much more than I intend. How greedy are they in php? Do they stop at line breaks in php? (And/or can you tell them how greedy to be/where to stop?)

Beyond regular expressions, I was wondering about what functions to use in order to split lines. Also, what would I use to (for example) select from line 77, column 12 to the end of line 80?
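On the greediness question, a small sketch with PHP's PCRE functions may help. The sample string is invented. Quantifiers are greedy by default, a trailing `?` makes one lazy, the `U` modifier flips the default for the whole pattern, and `.` stops at a line break unless the `s` modifier is set.

```php
<?php
$text = "Name: <Alice> and <Bob>\nNotes: next line";

// Greedy: .* grabs as much as possible, so it runs to the last ">" on the line.
preg_match('/<.*>/', $text, $greedy);

// Lazy: a ? after the quantifier makes it stop at the first ">".
preg_match('/<.*?>/', $text, $lazy);

// The U modifier inverts greediness for the whole pattern.
preg_match('/<.*>/U', $text, $ungreedy);

// By default "." does not match a newline, so this stops at the end of line 1.
preg_match('/Name:.*/', $text, $oneLine);

// With the s modifier "." also matches newlines, so the match crosses line breaks.
preg_match('/Name:.*/s', $text, $allLines);

var_dump($greedy[0], $lazy[0], $ungreedy[0], $oneLine[0], $allLines[0]);
```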
Re: Parsing text
You can specify whether they are to be greedy or not.
I recommend a regex tool, both as a training tool (it comes with great documentation) and as a build and test tool:
RegexBuddy: Learn, Create, Understand, Test, Use and Save Regular Expression
With regard to the second idea, you would probably have to read the bytes into a variable, looking for the EOL markers. But I haven't really thought that one through.
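One possible way to handle the "line 77, column 12 to the end of line 80" case is to work on an array of lines rather than scanning bytes for EOL markers. The sketch below is only an illustration: `extractRange` is a made-up helper name, and the four-element array stands in for the real file.

```php
<?php
// Hypothetical helper: returns the text from ($startLine, $startCol) to the end of $endLine.
// Line and column numbers are 1-based, as in most editors.
function extractRange(array $lines, int $startLine, int $startCol, int $endLine): string
{
    // array_slice() uses 0-based offsets, so shift the start line by one.
    $slice = array_slice($lines, $startLine - 1, $endLine - $startLine + 1);
    if ($slice === []) {
        return '';
    }
    // Drop the columns before $startCol on the first line only.
    $slice[0] = (string) substr($slice[0], $startCol - 1);
    return implode("\n", $slice);
}

// Small stand-in for the real file. With a file on disk, the lines could come from
// file($path, FILE_IGNORE_NEW_LINES) instead.
$lines = ['first line', 'second line', 'third line', 'fourth line'];
echo extractRange($lines, 2, 8, 3), "\n";
```

Note that `substr()` counts bytes, so for multibyte text `mb_substr()` would be the safer choice. Also, `file()` loads the whole file into memory; for very large files, reading with `fgets()` in a loop and counting lines is an alternative.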
Re: Parsing text
About splitting the lines up, I think you could use php's explode to split up the lines and then foreach to go through the array values (each value would be one line in this case).
I didn't look into it, but I think I'm along the right lines here.
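The explode and foreach idea could look roughly like this. The sample text and the `name: value` layout are assumptions, since the real file is not shown in the thread.

```php
<?php
// Invented sample with Windows line endings.
$text = "ID: 1001\r\nName: Alice Example\r\nNotes: none";

// Normalise "\r\n" to "\n" first so explode() can split on "\n" alone.
$lines = explode("\n", str_replace("\r\n", "\n", $text));

$fields = [];
foreach ($lines as $line) {
    // The limit of 2 keeps any later colons inside the value;
    // array_pad() guards against lines that contain no colon at all.
    [$name, $value] = array_pad(explode(':', $line, 2), 2, '');
    $fields[trim($name)] = trim($value);
}

print_r($fields);
```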
Re: Parsing text
Is there any way to tell that the data for an ID is a first name, last name, address, etc? (Maybe that's what the garbage does somehow?) If not, are the first names always first, then the last names, and then the addresses?
Look at: preg_match, preg_replace, explode, stristr, str_replace
Last edited by agent-j; May 31st, 2006 at 23:06.
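A short sketch of how some of the functions listed above might be combined to strip garbage lines before parsing. The thread never shows what the garbage looks like, so the repeated "REPORT PAGE" header and the form-feed character used here are pure assumptions.

```php
<?php
// Hypothetical input: a page header and a form feed stand in for the garbage lines.
$lines = [
    'ID: 1001',
    'Notes: first part   of a long note',
    "\fREPORT PAGE 2",
    'that continues after the page break',
    'ID: 1002',
];

$clean = [];
foreach ($lines as $line) {
    // stristr() is a case-insensitive substring search; it returns false when nothing matches.
    if (stristr($line, 'report page') !== false) {
        continue; // skip the assumed page-header line
    }
    // str_replace() works on fixed strings; here it strips stray form feeds.
    $line = str_replace("\f", '', $line);
    // preg_replace() works on patterns; here it collapses runs of spaces or tabs.
    $clean[] = preg_replace('/[ \t]+/', ' ', $line);
}

// preg_match() captures parts of a line; here it pulls out the numeric IDs.
$ids = [];
foreach ($clean as $line) {
    if (preg_match('/^ID:\s*(\d+)$/', $line, $m)) {
        $ids[] = $m[1];
    }
}

print_r($clean);
print_r($ids);
```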
Re: Parsing text
The data in the fields of the database come out in whatever order you determine by the construct of your query. If this is an area you are interested in, then you need to do some basic reading up on databases and SQL.