Can anyone please help me find a batch file script that automatically removes
TAB characters and replaces them with commas?
61 LEUKOCELL 2 7737 PK25 278 N Y
97 SNAP COMBO PLUS (FELV/FIV) 9906034 PK15 290.82 N Y
I have 90k+ lines of this in text files: can I automate the reformatting with a batch file to this:
97,SNAP COMBO PLUS (FELV/FIV),906034,PK15,90.82,N,Y
You don't need complicated methods to achieve a replacement as simple as this one. The small batch file below replaces all tabs with commas:
@set @a=0 /*
@cscript //nologo //E:JScript "%~F0" < input.txt > output.txt
@move /Y output.txt input.txt
@goto :EOF */

WScript.Stdout.Write(WScript.StdIn.ReadAll().replace(/\t/g,","));
Save this code with a .BAT extension.
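To make the replacement step easier to see in isolation, here is a minimal sketch of the same regular expression wrapped in a function. The name tabsToCommas and the sample line (including where its tabs sit) are assumptions for illustration, not part of the answer above; the code is plain JavaScript and has not been run under cscript.

```javascript
// Replace every tab character with a comma, using the same regex as the
// hybrid batch/JScript file above. tabsToCommas is a hypothetical name.
function tabsToCommas(text) {
  return text.replace(/\t/g, ",");
}

// Sample line modelled on the desired output in the question;
// the positions of the tabs are assumed.
var sample = "97\tSNAP COMBO PLUS (FELV/FIV)\t906034\tPK15\t90.82\tN\tY";
var converted = tabsToCommas(sample);
// converted is "97,SNAP COMBO PLUS (FELV/FIV),906034,PK15,90.82,N,Y"
```

Because every tab becomes a comma, two tabs in a row produce an empty field (a,,b), which is usually what you want in a CSV.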
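@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q34875733.txt"
SET "outfile=%destdir%\outfile.txt"
(
FOR /f "usebackqdelims=" %%a IN ("%filename1%") DO (
 SET "line=%%a"&call :process
)
)>"%outfile%"

GOTO :EOF

:process
REM column 1: first 4 characters, evaluated as a number to drop the padding
SET /a "col1=%line:~0,4%"
REM column 2: next 32 characters
SET "col2=%line:~4,32%"
REM squeeze the padding: each pass replaces two spaces with one
SET "col2=%col2:  = %"
SET "col2=%col2:  = %"
SET "col2=%col2:  = %"
SET "col2=%col2:  = %"
REM drop the single trailing space that is left over
IF "%col2:~-1%"==" " SET "col2=%col2:~0,-1%"
REM remaining columns are space-separated
FOR /f "tokens=1-5" %%i IN ("%line:~36%") DO ECHO %col1%,%col2%,%%i,%%j,%%k,%%l,%%m
GOTO :EOF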
You would need to change the settings of sourcedir and destdir to suit your circumstances.
I used a file named q34875733.txt containing your data for my testing.
Produces the file defined as %outfile%.
This assumes your layout is fixed-column as described and that there are no characters in the data to which batch shows sensitivity. Repeating your 2 lines to build a file with 90K+ lines, the run time was about 7 minutes on my machine.
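For each line, assign the line to line and process by calling :process, which takes the first 4 characters as column 1, the next 32 characters (with the padding spaces squeezed out) as column 2, and the space-separated remainder as the last five columns.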
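The fixed-column slicing used by that script can also be sketched in JavaScript, for instance as a function to drop into the JScript half of a hybrid file. The helper name fixedLineToCsv is hypothetical, the offsets (4 and 36) are taken from the batch script above, and trimming both ends of column 2 is a small simplification of what the batch version does.

```javascript
// Hypothetical helper: turn one fixed-width line into a comma-separated line.
// Assumed layout, taken from the offsets in the batch script:
//   characters 0-3   -> column 1 (a number, space padded)
//   characters 4-35  -> column 2 (a description, space padded)
//   character 36 on  -> five space-separated columns
function fixedLineToCsv(line) {
  // parseInt skips the padding and keeps only the digits
  var col1 = parseInt(line.substring(0, 4), 10);
  // squeeze runs of spaces to one, then drop a leading/trailing space
  var col2 = line.substring(4, 36).replace(/ +/g, " ").replace(/^ | $/g, "");
  // trim the remainder and split it on whitespace
  var rest = line.substring(36).replace(/^\s+|\s+$/g, "").split(/\s+/);
  return [col1, col2].concat(rest.slice(0, 5)).join(",");
}
```

As with the batch version, this only works if every line really follows the same column widths; a line whose description runs past column 36 would be split in the wrong place.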
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q34875733.txt"
SET "outfile=%destdir%\outfile.txt"
SET "tab=	"
(
FOR /f "usebackqtokens=1-7delims=%tab%" %%a IN ("%filename1%") DO (
 REM detect missing column 3
 IF "%%g" == "" (ECHO %%a,%%b,,%%c,%%d,%%e,%%f) ELSE (ECHO %%a,%%b,%%c,%%d,%%e,%%f,%%g)
)
)>"%outfile%"

GOTO :EOF
Having looked at your source data, it would appear that the columns are aligned using tabs and that column 3 is sometimes missing (413 denelan).
Hence a replacement routine (you'd need to reformat the source data line 61 LEUKOCELL 2 to the same format as you appear to use for the rest of the file).
Note that the character between the quotes in the setting of the variable tab is a Tab, not a string of spaces.
So this time, break the line into 7 columns using tab (or a sequence of tabs) as the separator, assign them to %%a..%%g, and regurgitate. If column 3 is missing, %%g will not be assigned (since the line is one column short), so %%g will appear to be nothing. If this situation is detected, insert an empty field as column 3 (hence the ,, in the first ECHO).
I've assumed that all of the data contains either 7 columns, or 6 columns where column 3 is missing.
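The missing-column rule can be sketched in JavaScript as well, which may help when checking it against a few sample lines. The name tabLineToCsv is hypothetical, and the sketch makes the same assumption as the batch file: a line has either 7 fields, or 6 fields with column 3 absent.

```javascript
// Hypothetical helper mirroring the FOR /F logic above: split on runs of
// tabs (FOR /F treats consecutive delimiters as one), and if only six
// fields come back, assume column 3 is the one that is missing.
function tabLineToCsv(line) {
  var fields = line.split(/\t+/);
  if (fields.length === 6) {
    // insert an empty field at position 3 (index 2)
    fields.splice(2, 0, "");
  }
  return fields.join(",");
}
```

For example, tabLineToCsv("a\tb\t\td\te\tf\tg") gives "a,b,,d,e,f,g", while a line with all seven fields simply has its tabs turned into commas.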
This should replace all TABS with a comma.
call jrepl "\t" "," /x /f "input-file.txt" /o "output-file.txt"
This uses a native Windows batch script called Jrepl.bat, written by dbenham, that uses JScript to make it very robust and swift.
Place it in the same folder as the batch file, or in a folder that is on the system path.
There is also a copy on Dropbox (unblock it after downloading):