"readme-EPD" for "40H-EPD-2025B" Tools & Utilities "40H-EPD" is a collection of Command Prompt utility tools by Norman Pollock (USA) that is free to download for personal use. The tools are primarily for use with "epd" data files in "Windows" or in a "Java Runtime Environment" "40H-EPD" tools and instructions are copyright (c) 2010-2025 by Norman Pollock. All rights reserved. "40H-EPD" can be freely downloaded and used for non-commercial purposes as long as "40H-EPD" is kept intact and no changes are made to any files. Any commercial distribution or commercial use of "40H-EPD" is strictly forbidden. Disclaimer: The "40H-EPD" package is distributed "as is". No warranty or guarantee of any kind is expressed or implied. The user assumes all risks of usage. The author is not responsible for any damage or losses of any kind caused by the use or misuse of the tools or this "readme-EPD" file. Current download sites for "40H Tools": 40hchess.epizy.com nk-qy.info/40h (Thanks to Frank Quisinsky) Please bookmark. Contact: Norman Pollock rc1242@yahoo.com =================================================================== FAQ: 1. What is the purpose of "40H" tools? "40H" tools enable users to do specific chess tasks. 2. What computer skills are needed to use the "40H" tools? The user has to be able to use command-line commands in a "Command Prompt" window. 3. Where are the full usage instructions for each "40H-EPD" tool? Please read the full usage instructions before using a tool. FULL USAGE INSTRUCTIONS are the last section of this readme. 4. What does the term "40H" have to do with chess? 40H is a hexadecimal and computer science number that is equal to the number of squares in a chessboard. 40H = 64 decimal. 5. Do the "40H" tools require Internet access? Only to download the tools. 6. On which platforms will the "40H" tools work? "40H" tools come in 2 formats: "Windows" executables (".exe") and "Java Class Files". Download "40H-Java Class Files" for non-Windows platforms. 7. 
Are the "40H" tools portable? Yes. You can even save them on a flash drive and use them on different PCs. They are single files and no "dlls" are required. They do not require a "setup" program and they do not affect the registry. They can be removed by simple deletion. 8. Are the "40H" tools 64-bit and do they use multiprocessing? They are each 32-bit tools and they do NOT use multiprocessing. 9. Are the "40H" downloads checked for viruses/malware? "40H" downloads are checked at www.virustotal.com 10. Is the "readme" file included in the download? Yes, and it is also on the website. The version on the website has the latest updates and corrections. =================================================================== OVERVIEW OF TOOLS (FULL INSTRUCTIONS AT THE END OF THIS FILE) Each "40H-EPD" command-line tool consists of a single file. Each tool executes from a command-line in a "Command Prompt" window. The "40H-EPD" tools do not change any input file. Output appears in a new file(s). There are 35 tools and 7 batch scripts in "40H-EPD". Each batch script is paired with one of the tools. 1. "bmOpcode" is used in combination with the chess GUI "Arena 3.5.1" by Martin Blume, and a computer engine for analysis. Together they output "best move" ("bm"), "centipawn evaluation" ("ce"), and "predicted variation" ("pv") opcodes. 2. "epd3fold" / "pgn3fold" is used in combination with the utility tool "PGN-Extract" by David Barnes and the "40H-PGN" tool "numExtract". Together they input a "pgn" file and output those games having a 3-fold position repetition. 3. "epdColor" separates the records based on whether White or Black is the active color (the color that makes the next move). 4. "epdConvert" produces a "pgn" file from an "epd" file. Each "epd" record is put into a "pgn" game shell having a "FEN" tag. 5. "epdDifference" extracts records based on a user-specified simple material difference range between the pieces of the two colors. 
The user can use the default piece values, or specify custom values. The simple material difference of each record is listed in a separate file.

6. "epdExtra" separates the records of the input "epd" file based on the number of "extra" promoted pieces. An extra promoted piece is a 2nd or later Queen, or a 3rd or later Rook, Bishop, or Knight.

7. "epdFaux" removes all faux "en passant" target square notations in an "epd" file. A faux "en passant" target square is one that cannot be attacked, legally or otherwise, by an opponent pawn.

8. "epdFin" / "pgnFin" is used in combination with the utility tool "PGN-Extract" by David Barnes. Together they input a "pgn" file. Then for each game, they output the "epd" record of the final position.

9. "epdFlip" reverses the colors and does a vertical reflection about the imaginary horizontal line between ranks "4" and "5". The color to move, "castling" rights and "en passant" rights are also reversed. The new "epd" record is logically equivalent to the old "epd" record.

10. "epdImbalance" separates the records into two files: one for records where the opposing sides have different material (imbalanced), and the other for records where the opposing sides have the same material (balanced).

11. "epdInsert" appends new opcodes to the records.

12. "epdInsuff" / "pgnInsuff" is used in combination with the utility tool "PGN-Extract" by David Barnes. Together they extract drawn games that end with insufficient checkmating material.

13. "epdKings" extracts records where the Kings are advanced beyond each other. In other words, a record is extracted if the White King's rank (row) is higher than the Black King's rank (row).

14. "epdMask" / "pgnMask" extracts records based on a user-supplied mask structure such as "5rk1/8/8/8/8/8/8/5RK1".

15. "epdMaterial" / "pgnMaterial" extracts records based on a user-specified file "pieces" that lists specific pieces and their specific quantities.

16.
"epdMerge" joins 2, 3, 4 or 5 "epd" files by adding one line from each successive input file, then repeating. 17. "epdOccur" lists the number of occurrences of each distinct position in the input "epd" file. It also lists the line numbers where the positions occurred. 18. "epdOrder" sorts the records in descending order based on the "centipawn evaluation" ("ce" opcode) of the position from White's perspective. 19. "epdPawnDiff" separates records based on the difference in the number of pawns of the two sides. 20. "epdPieces" extracts records based on a user-specified number range for the number of chess pieces on the board. 21. "epdPly" / "pgnPly" is used in combination with the utility tool "PGN-Extract" by David Barnes. Together they input a "pgn" file. Then for each game, they output the "epd" record of the position after a user-specified number of plies. 22. "epdPosition" / "pgnPosition" extracts records based on a user-supplied board position such as 1R6/r1p5/k1K5/8/1P6/4b3/8/8. 23. "epdRandom" randomly rearranges the lines of the input file. The user has the option to only output a user-specified number of lines. 24. "epdRemove" removes a user-specified opcode from the "epd" input file and saves the removed opcode and its values in a second output file. 25. "epdSingle" removes records containing the same position as a previous record in the "epd" file. 26. "epdSort" sorts the lines of an "epd" file alphanumerically. It sorts in either ascending or descending order. 27. "epdToken" uses a user-specified "token_number" to output the token (field) in that position on each line. If that token does not exist, a blank line is output. 28. "epdTrim" removes all opcodes from the "epd" input file, and saves the removed opcodes two different ways - horizontally and vertically. 29. "epdTriple" extracts records where there are 3+ pawns of the same color in the same column. Another output file contains records with 4+ pawns of the same color in the same column. 30. 
"idOpcode" outputs a file of "id" opcodes, each containing an ID number. 31. "smOpcode" outputs supplied move ("sm") opcodes from a "pgn" file. Each "sm" opcode corresponds to an actual move in the "pgn" file. 32. "txtChar" is useful in determining the presence of unexpected "control" or "extended" characters in a text file. Such characters may cause issues with processing the text file. 33. "txtColumn" outputs a contiguous range of columns in a text file. It uses user-specified starting and ending column numbers. 34. "txtOccur" lists the number of occurrences of each distinct non-blank line. It also lists the line numbers where the lines occurred. 35. "txtSingle" removes any line that is a duplicate of a prior line. The remaining lines are in their original order. =================================================================== "epd" AND "txt" PREFIXES on 40H-EPD Filenames A "txt" prefix indicates that the tool is mostly used on general text files, but is also used on "epd" files (which are also text files). An "epd" prefix indicates that the tool is mostly used on "epd" files. Some "epd" tools ("epdInsert", "epdMerge", "epdRandom", "epdToken", and "epdSort") are also used on general text files. Two tools have both an "epd" AND a "txt" version. That is because they are slightly different. The "epd" version ignores opcodes while the "txt" version processes the entire line. The tools are: "epdOccur" and "txtOccur" "epdSingle" and "txtSingle" =================================================================== BATCH/SCRIPT PGN TOOLS USing A 40H-EPD TOOL pgn3fold.cmd: extracts "pgn" games containing a 3-fold repetition of a position. Uses "pgn-Extract", "epd3fold" and "numExtract". pgnFin.cmd: lists the final position of each "pgn" game in "epd" format. Uses "pgn-Extract" and "epdFin". pgnInsuff.cmd: extracts games ending in a position with insufficient material for a checkmate. Uses "pgn-Extract", "epdInsuff" and "numExtract". 
pgnMask.cmd: extracts "pgn" games containing a position with a user-specified mask_structure in "epd" format. Uses "pgn-Extract", "epdMask" and "numExtract".

pgnMaterial.cmd: extracts "pgn" games containing a position with piece/quantity values from the user-specified file "pieces". Uses "pgn-Extract", "epdMaterial" and "numExtract".

pgnPly.cmd: lists the position in each "pgn" game that occurred after a user-specified ply. Outputs in "epd" format. Uses "pgn-Extract" and "epdPly".

pgnPosition.cmd: extracts "pgn" games containing a position in a user-specified file of "epd" positions. Uses "pgn-Extract", "epdPosition" and "numExtract".

Full instructions are available in the instructions for the related "40H-EPD" tool.

The 7 batch script "pgn" tools are included in the "40H-EPD" download.

All files executed within the batch script must either be in the Working Folder or on the System Path. An input data file must be in the Working Folder or be specified with a pathname.

===================================================================
INTRODUCTION:

Download the "40H-EPD" compressed file. It is packed in "7-zip" format. It can be unpacked using "7-zip", available at:
   http://www.7-zip.org

Unpacking the "40H-EPD" download file results in 35 "Windows" executable ("exe") files, 7 batch script ("cmd") files, and 1 readme file.

The "readme-EPD" file is oriented to users of the "Windows" executable files. Users of the "Java" class files have to make adjustments for use in a Java Environment.

Each "40H-EPD" tool is a "command-line tool", which means that it executes on a command-line in a "Command Prompt" window.

The "40H-EPD" tools were written in "Java" and compiled using "gcj 3.4". All coding is original.

Each "40H-EPD" tool consists of a single self-contained file. No external "dlls" are required.

Each "40H-EPD" tool is portable and just has to be copied to be installed. It does not need a setup program and it does not write any data to the registry.
Unless otherwise indicated, each "40H-EPD" tool inputs an "epd" file.

"40H-EPD" tools DO NOT MAKE ANY CHANGES to the input files. Output appears in a new file(s). Output files are pre-named. Users should rename the output file(s) before they are overwritten by the next execution of the tool.

"40H-EPD" tools assume the records in the "epd" input files are written in Standard Algebraic Notation ("SAN") and adhere to the "EPD Standards" in the "Standard PGN Specification Guide".

"EPD Standards" are in the "Standard PGN Specification Guide" at:
   http://www.saremba.de/chessgml/standards/pgn/pgn-complete.htm
and also at
   http://jchecs.free.fr/pdf/EPDSpecification.pdf

The 35 "40H-EPD" tools are: "bmOpcode", "epd3fold", "epdColor", "epdConvert", "epdDifference", "epdExtra", "epdFaux", "epdFin", "epdFlip", "epdImbalance", "epdInsert", "epdInsuff", "epdKings", "epdMask", "epdMaterial", "epdMerge", "epdOccur", "epdOrder", "epdPawnDiff", "epdPieces", "epdPly", "epdPosition", "epdRandom", "epdRemove", "epdSingle", "epdSort", "epdToken", "epdTrim", "epdTriple", "idOpcode", "smOpcode", "txtChar", "txtColumn", "txtOccur", and "txtSingle".

The 7 batch/script "pgn" tools are: "pgn3fold", "pgnFin", "pgnInsuff", "pgnMask", "pgnMaterial", "pgnPly" and "pgnPosition".

Thanks to Jim Ablett for helping me to get started on this project and for showing me how to compile "Windows" executables.

===================================================================
LIMITATIONS:

"40H-EPD" tools ONLY execute from a command-line within a Command Prompt. See http://dosprompt.info/ for Command Prompt support.

"40H-EPD" tools are subject to the capacities of the maximum array sizes that are specified in their coding. These maximum array sizes are adequate for most "epd" input files. However, if you use a very large "epd" input file, there is a possibility that an overflow error will occur or that execution will take an extended amount of time.
"40H-EPD" tools might not perform properly when processing an input "epd" file that is extremely fragmented. Be sure to defragment your hard drive regularly. "40H-EPD" tools might not perform properly when processing an input "epd" file that does NOT completely conform to "EPD" Standards". Some "40H-EPD" tools may take a very long time to process very large "epd" files. You could check the "Task Manager" if you think the tool has stalled. Newer versions of external tools mentioned in this document may or may not work properly with "40H" tools. Also, the availability to download those tools may change. For example, a web site may change to a new address, or even discontinue. Any such event could affect "40H" tools. It is therefore suggested that you continue to keep a copy of any old version of an external tool that works properly with "40H" tools. =================================================================== INSTALLATION, FILES, AND FOLDERS: Create a folder named "40H" if it does not already exist. Extract the download into the "40H" folder. The extraction will unpack the 35 tools into a subfolder named "40H-EPD-2025B". The tools can then be copied/moved to any other folder, preferably one that is already on the System Path. For users who only will be using the "40H-EPD" tools occasionally, the simplest arrangement is to copy the desired tool to the folder where the input file(s) are located and then run in that folder. Likewise, you could copy the input file(s) to the folder where the "40H-EPD" tools are located and then run in that folder. For users who will be using "40H-EPD" tools often, copy/move the "40H-EPD" tools to a folder that is already on the System Path. (Type "path" in a command window to see the System Path.) This way you can run any of the "40H-EPD" tools from any folder. If you do not have such a folder on your System Path, you would first have to create the folder and then edit the Path Variable. 
Use "search" in "Settings" to find "System Environment Variables". Then click on "Environment Variables", then "Path" in "System Variables", and then "edit". This may vary depending on "Windows" version. Input files are not changed. For extra safety, the user should save all input "pgn" files on another storage medium. Running a "40H-EPD" tool without mentioning all required input files and parameters will list the version number of the tool, the syntax, an example of usage, and the names of the output files. The tools and the input files have to interact. The following 4 items summarize possible arrangements: (1) If the tool and input file(s) are in the same folder, and you are working in that folder, you are good. (2) If the tool is in a folder on the System Path, and you are working in the folder containing the input file(s), you are good. (3) It is best to work in the folder containing the input file(s). This folder is referred to as the "Working Folder". It is also the folder where output files will be created. (4) A pathname is needed for an input file that is NOT in the Working Folder. Output appears in a new file(s). An output file cannot be used as input to the "40H" tool that produced it, unless its filename is changed. Output files are TEMPORARY files. They are created in the Working Folder. Be sure to rename/copy/move any output files that you want to keep. The next execution of the tool in that folder will overwrite the previous output file(s). Do not change an original output file to "read-only" as that will prevent the creating tool from executing in that folder. Many "40H-EPD" output filenames are in the form "out*.epd". Many times the records NOT extracted to "out*.epd" are output to files whose filenames are in the form "exclude*.epd". Sometimes the user is more interested in "exclude*.epd" than in "out*.epd". "40H-EPD" tools accept "UTF-8" encoded "epd" files. However, all "40H-EPD" output files are "Latin-1" encoded. 
"Latin-1" is the "PGN Standard". =================================================================== EXECUTION: Each tool executes from a command-line in a "Command Prompt" window. The general format for running a tool is: tool_name [data_filename] epd_filename [parameter(s)] Examples: 1. epdImbalance alpha.epd 2. epdInsert inlist alpha.epd (uses data file) 3. idOpcode alpha.epd Openings (uses parameter) After entering the proper command-line, and making sure all files are accessible, press to start execution. Some tools, like "epd3fold" and "epdOccur" take more time to execute large files compared to other "40H-EPD" tools. Output is in a new file(s) in the Working Folder. The original input file is not changed. Be sure to follow the specific instructions for each tool. =================================================================== EXTERNAL TOOLS: Future updates of external tools might or might not work properly with "40H-EPD" tools. It is recommended to always retain a working version. Users need a "PGN Viewer" to see a game or a position. The "PGN Viewer" could also be a GUI (graphical user interface). "PGN-Extract" by David Barnes is a PGN/EPD/FEN utility tool that performs many functions. "PGN-Extract" is free to download. The current download site is: http://www.cs.kent.ac.uk/people/staff/djb/extract.html "Arena" (version 3.5.1) by Martin Blume is a UCI/Winboard Graphical User Interface. "Arena" is free to download. Its current download site is: http://www.playwitharena.com/ =================================================================== BATCH SCRIPTS ("cmd" or "bat" files) Using batch scripts can greatly increase the convenience of the "40H-EPD" suite. It can be used to string together several tools. All files mentioned within the batch script should be in the Working Folder or on the Path. 
Assuming the following is saved as "pgn3fold.cmd":

   PGN-Extract -Wepd --nofauxep -s -otemp.epd %1
   epd3fold temp.epd
   numExtract numbers %1

Usage: pgn3fold alpha.pgn
Output: manifest-3f, outZ.pgn

===================================================================
OPCODES

The following tools add or remove opcodes from all records in an "epd" file:

   epdInsert: adds a single opcode
   epdRemove: removes a single opcode
   epdTrim: removes all opcodes

===================================================================
===================================================================
FULL USAGE INSTRUCTIONS:

=========================(1) bmOpcode =============================

"bmOpcode" is used in combination with the chess GUI "Arena 3.5.1" by Martin Blume, and a computer engine for analysis. Together they output "best move" ("bm"), "centipawn evaluation" ("ce"), and "predicted variation" ("pv") opcodes.

By using "epdInsert", and the output file "bmlist", "bm" and "ce" opcodes can be appended to the "epd" input file.

By using "epdInsert", and the output file "pvlist", "pv" opcodes can be appended to the "epd" input file.

"bmlist" and "pvlist" do not contain blank lines. If the "epd" input file contains blank line(s), the user will have to use a text editor to insert blank line(s) into "bmlist" and "pvlist" to coordinate with the "epd" file.

In "Arena 3.5.1", use "Automatic Analysis" under "Engines". The engine you are going to use has to be "loaded", and "configured" if necessary. Only use one computer engine.

------
Configuration for "Arena 3.5.1":

Within "Engine/Automatic Analysis/Source" set "Direction" to "forward".

Within "Engines/Automatic Analysis/Engines" choose your engine and set the "Level".

Within "Engine/Automatic Analysis/Output":
a) Check that the output file name is "Analyses.log". Previous versions of "Arena" and "bmOpcode" used a different filename. "bmOpcode" looks for a file named "Analyses.log".
b) set "Protocol file" to "Overwrite".
Within "Engine/Automatic Analysis/Options" set Analysis Lines Minimum search depth to "1". Within "Options/Appearance/OtherSettings/Chess": set "Values always from white's point of view." This is to conform to "PGN Standards". ----- "bmOpcode" will input "Analyses.log" and create output files "bmlist" and "pvlist". "bmlist" contains "bm" and "ce" opcodes. "pvlist" contains "pv" opcodes. Line numbers in "bmlist" and "pvlist" match the corresponding record line numbers in the "epd" input file. "bmOpcode" only outputs one "bm" value per position. Additional "bm" values can be added by using a text editor. "bmOpcode" is dependent upon the engine producing a large amount of descriptive output. Since engines differ, "bmOpcode" performs better with some engines than with others. A value of "..." for a "bm", "ce" or "pv" opcode indicates that "bmOpcode" was not able to process that record. Sometimes running "bmOpcode" again will be successful. But there will be times when "bmOpcode" will not be able to be successful due to unusual output in "Analyses.log". To append the "bm" and "ce" opcodes in "bmlist", or the "pv" opcodes in "pvlist" to the "epd" input file: 1. << Arena outputs analysis.log from beta.epd >> 2. bmOpcode Analyses.log 3. << User checks bmlist against beta.epd >> 4. copy bmlist inlist OR copy pvlist inlist 5. epdInsert inlist beta.epd 6. << Output is outN.epd >> "Analyses.log" must be located in the Working Folder and cannot be referenced using a pathname. "bmOpcode" ONLY outputs one "bm" value, although occasionally Arena finds more than 1 "bm" (best move) for a position. To locate such additional "bm" moves, search "analyses.log" and search for "solutions" with the "s" at the end. Then using a text editor, add the additional "bm" moves into "bmlist". Do not insert any commas or other punctuation. For example: Suppose the original bmlist had the line: bm g2-g3; ce +M2; and suppose there is another equally good "bm" (best move): Rg6-g3. 
Then you would manually insert it into bmlist with a text editor:
   bm g2-g3 Rg6-g3; ce +M2;

Syntax: bmOpcode Analyses.log
Example: bmOpcode Analyses.log
Output: bmlist, pvlist

Comments:

1. A "ce" opcode gives a relative evaluation of a position, from the White point of view, based on "centipawns" which are 1/100 of a pawn. This is different, by a factor of 100, from the common evaluation based on one pawn. For example, a common evaluation of "+1.35" pawns is equivalent to a "ce" evaluation of "135".

2. A "pv" opcode gives a variation of what might happen with best play by both colors, starting with the "best move". It is the variation that the computer engine believes to be the best series of moves for each side, at the time the engine stopped analyzing.

3. "bm", "ce" and "pv" values usually change as a computer engine continues to analyze a position.

4. Older versions of Arena (versions 1 and 2) used the spelling "Analysis.log" for the output log file. You would have to change the name to "Analyses.log" for use with "bmOpcode".

=====================(2) epd3fold / pgn3fold ======================

"epd3fold" is used in combination with the utility tool "PGN-Extract" by David Barnes and the "40H-PGN" tool "numExtract". Together they input a "pgn" file and output those games having a 3-fold position repetition.

A position is "repeated" if all pieces of the same kind and color are on identical squares, and all possible moves are the same. For computer processing, "epd3fold" interprets this to mean that the first four tokens in their respective "epd" records are identical.

Although the two interpretations are ever so slightly different, the possible difference in output is microscopically small. For example, if one position has a "castling" (or "en passant") permission that is temporarily not executable, it will have the same possible moves as a similar position that only differs by not having that "castling" (or "en passant") permission.
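
The four-token comparison described above can be sketched in a few lines of Python. This is only an illustration and not the code of "epd3fold": the function names "position_key" and "threefold_positions" are hypothetical, and the sketch assumes that the "epd" records of a single game have already been collected into a list.

```python
from collections import Counter


def position_key(epd_record):
    """Return the first four EPD tokens: board, active color, castling, en passant."""
    return " ".join(epd_record.split()[:4])


def threefold_positions(game_records):
    """Return the position keys that occur three or more times in one game."""
    counts = Counter(position_key(r) for r in game_records if r.strip())
    return [key for key, n in counts.items() if n >= 3]


# Hypothetical game: 1. Nf3 Nf6 2. Ng1 Ng8 3. Nf3 Nf6 4. Ng1 Ng8
P0 = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"
P1 = "rnbqkbnr/pppppppp/8/8/8/5N2/PPPPPPPP/RNBQKB1R b KQkq -"
P2 = "rnbqkb1r/pppppppp/5n2/8/8/5N2/PPPPPPPP/RNBQKB1R w KQkq -"
P3 = "rnbqkb1r/pppppppp/5n2/8/8/8/PPPPPPPP/RNBQKBNR b KQkq -"
game = [P0, P1, P2, P3, P0, P1, P2, P3, P0]

repeated = threefold_positions(game)  # only the starting position occurs 3 times
```

Opcodes come after the fourth token, so they do not affect the comparison. The "en passant" target square is the fourth token, so any difference there changes the key.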
A 3-fold position repetition is very hard for humans to notice if the positions do not occur in successive moves.

"PGN-Extract", with the "-Wepd" and "--nofauxep" parameters, precedes "epd3fold" in the sequence of execution. It produces an "epd" file from a "pgn" file and also removes faux "en passant" notations.

"epd3fold" then produces the game numbers of games having a 3-fold repetition in the "epd" file from the previous step. Those numbers are listed in the output file "numbers". Also, repeated positions, the game numbers and numbers of occurrences are listed in the output file "manifest-3f".

"numExtract" follows "epd3fold" in the sequence of execution. It uses the file "numbers" to produce the output file "outZ.pgn". "outZ.pgn" contains the "pgn" games having a 3-fold repetition. No game is listed more than once even if it has more than one 3-fold position repetition.

To demonstrate what could happen if faux "en passant" notations are NOT removed before using "epd3fold", consider the following game fragment:

   1. e4 e5 2. Nf3 Nc6 3. Ng1 Nb8

The positions after move #1 and #3 are identical. However, after move #1, the "epd" has a faux target square "e6".

   rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq e6
   rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq -

This difference would cause "epd3fold" to INCORRECTLY evaluate the two positions as NOT being the same. But the fact is that all their pieces are of the same kind and color, are on identical squares, and all possible moves are the same.

Syntax: PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn
        epd3fold temp.epd
        numExtract numbers filename.pgn
Output: manifest-3f, outZ.pgn

Comments:

1. Because "epd3fold" with "PGN-Extract" requires three tools to be executed consecutively, a "Windows" batch script ("cmd" file) may be used for convenience.
You can copy and paste the following code into a text editor, and save it with the name "pgn3fold.cmd" :

   PGN-Extract -Wepd --nofauxep -s -otemp.epd %1
   epd3fold temp.epd
   numExtract numbers %1

Syntax: pgn3fold filename.pgn
Example: pgn3fold beta.pgn

2. "pgn3fold.cmd" is included in the "40H-EPD" download.

3. Using the tool "gameNum" on the input "pgn" file before using "pgn3fold" is recommended.

4. "epdFaux" also removes faux "epd" target squares.

5. All files mentioned within the batch script must be in the Working Folder or on the Path.

=========================(3) epdColor =============================

"epdColor" separates the records based on whether White or Black is the active color (the color that makes the next move).

"outW.epd" contains the records having White move next and "outB.epd" contains the records having Black move next.

Syntax: epdColor filename.epd
Example: epdColor alpha.epd
Output: outW.epd, outB.epd

Comments:

1. "epdColor" is often used before using other tools in order to keep results separated by active color.

=========================(4) epdConvert ===========================

"epdConvert" produces a "pgn" file from an "epd" file. Each "epd" record is put into a "pgn" game shell having a "FEN" tag.

Since the conversion is from an "epd" file, not a "fen" file, the values for the halfmove clock and the fullmove counter are not available. Therefore the default values of "0" and "1" are used.

The output file is "outEPD.pgn".

If a best move ("bm") or predicted variation ("pv") opcode is present, the "bm" move or the "pv" moves will be output. If both are present, only the "pv" moves will be output. Usually the first move of the "pv" is the "bm" move. If more than one "bm" value is present, only the first one will be processed.

If an "id" opcode is present, it is output as an "ID" tag. Other opcode data is NOT carried forward to the "pgn" file.

Each game in the output is presumed to have an undetermined result ("*").
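
As a rough illustration of the "game shell" idea, the following Python sketch builds a minimal "pgn" game from one "epd" record. It is not the code of "epdConvert", and the actual layout of "outEPD.pgn" may differ. The function name "epd_to_pgn_shell" is hypothetical, the "SetUp" tag is an assumption (the "PGN Standard" pairs it with the "FEN" tag), and the handling of "bm", "pv" and "id" opcodes is left out.

```python
def epd_to_pgn_shell(epd_record):
    """Wrap the position of one EPD record in a minimal PGN game with a FEN tag."""
    # The first four EPD tokens are the board, active color, castling and en passant.
    position = " ".join(epd_record.split()[:4])
    # EPD has no halfmove clock or fullmove counter, so use the defaults "0" and "1".
    fen = position + " 0 1"
    # The result is undetermined, so both the tag and the movetext end with "*".
    return f'[SetUp "1"]\n[FEN "{fen}"]\n[Result "*"]\n\n*\n'


record = 'rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq - id "demo";'
shell = epd_to_pgn_shell(record)
```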
If you suspect that a checkmate or stalemate is possible due to the "pv" or "bm" data, you can use "PGN-extract" with the "--fixresulttags" option to correct the results. For example:

   pgn-extract --fixresulttags -s -oout.pgn outEPD.pgn

The file "out.pgn" will have the correct results for games ending in a checkmate or stalemate. "PGN-extract" also checks the legality of the moves and converts LAN to SAN.

The formatting in "outEPD.pgn" can be improved using "trim" from "40H-PGN".

Syntax: epdConvert filename.epd
Example: epdConvert alpha.epd
Output: outEPD.pgn

Comments:

1. Users who do NOT want to have the "bm" or "pv" moves output, should use "epdRemove" or "epdTrim" before using "epdConvert".

2. A "pgn" with a "FEN" tag is not useful if you are building an opening book.

=========================(5) epdDifference ========================

"epdDifference" extracts records based on a user-specified simple material difference range between the pieces of the two colors. The user can use the default piece values, or specify custom values. The simple material difference of each record is listed in a separate file.

"Simple material difference" is defined here as the total of the piece values of the active color minus the total of the piece values of the non-active color. The default piece values are Queen = 9, Rook = 5, Bishop = 3, Knight = 3 and Pawn = 1. The user can specify other values.

"epdDifference" uses the perspective of the active color (the color making the next move). So, if the simple material difference is 3, it means that the total simple piece value of the active color exceeds that of the non-active color by 3 Pawn units. Likewise, a simple material difference of -3 Pawn units means that the total simple piece value of the non-active color exceeds that of the active color by 3 Pawn units.

The default simple material piece values are: Queen = 9 Pawn units,
Rook = 5 Pawn units, Bishop = 3 Pawn units, Knight = 3 Pawn units, Pawn = 1 Pawn unit.

The King is not given a piece value because each color always has exactly one King, and that balances out. Secondly, any piece value for the King would have to be infinite because of the King's infinite importance.

The user has the option to set their own simple material piece values by specifying those values on the command line, after specifying the range. The user must specify the values in the order of Queen, Rook, Bishop and Knight. All 4 values must be specified even if it is a default value. The user does not have to specify a piece value for a Pawn because the Pawn is the unit of comparison.

For example, if the user wants the Queen to be worth 10 Pawn units, and the Bishops to be worth 3.5 Pawn units, and the Rook and Knight to retain their default values, the user would state the following:

   10 5 3.5 3

Notice that there is a space between the values but there is no punctuation.

The simple material difference is then calculated by taking the total simple piece value of the active color minus the total simple piece value of the non-active color. For example,

   3r2q1/5pk1/6p1/6P1/7Q/8/6K1/7R w - -

has total simple piece values of:

   White piece values: 1Q + 1R + 1P = 15
   Black piece values: 1q + 1r + 2p = 16

Since White is the active color, the simple material difference = -1.

The simple material difference is a very rough estimator of which color has the advantage in the game. It does NOT take into account the many nuances of each position such as threats, weaknesses, mobility, king protection, and impending captures.

To run "epdDifference", the user has to specify a RANGE for the simple material difference. The minimum value is followed by the maximum value, but the minimum value can equal the maximum value. These values are inclusive.
For example:

    epdDifference alpha.epd 2 5

will extract records from alpha.epd which have a simple material
difference from 2 to 5 Pawn units, inclusive.

An example where the user chooses the piece values:

    epdDifference alpha.epd 2 5 9.5 5 3.1 3

will extract records the same as in the previous example, except
that new piece values are specified on the command line in the
following order:

    Queen = 9.5 Pawn units,
    Rook = 5 Pawn units,
    Bishop = 3.1 Pawn units,
    Knight = 3 Pawn units.

The values of all four piece types must be listed if you want to
change the piece values, and the order must be Queen, Rook, Bishop,
and Knight.

Other usage examples:

    epdDifference alpha.epd 0 0

will extract records where the simple material difference is 0.
Note that this does not mean that the two colors have the same
pieces.

    epdDifference alpha.epd 2 2

The extracted records will have a simple material difference of
exactly 2 Pawn units. The total simple piece value of the active
color exceeds that of the non-active color by exactly 2 Pawn units.

    epdDifference alpha.epd 2 6

The extracted records will have a simple material difference
ranging from 2 Pawn units to 6 Pawn units inclusive.

    epdDifference alpha.epd -3 1

The extracted records will have a simple material difference
ranging from -3 Pawn units to 1 Pawn unit inclusive.

    epdDifference alpha.epd -5 -2

The extracted records will have a simple material difference
ranging from -5 Pawn units to -2 Pawn units inclusive.

The output file "outM.epd" contains the records that are in the
range.

The output file "outSMD.epd" contains a list of all the records
along with a new opcode "smd" that specifies the simple material
difference for that record. The records are listed in the original
input order.
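
The calculation described above can be illustrated with a short
Python sketch. It is NOT the source code of "epdDifference"; the
function names "simple_material_difference" and "in_range" are
hypothetical, and the sketch assumes each record starts with the
piece placement field followed by the active color.

```python
# Illustrative sketch only; not the actual "epdDifference" code.
# Default piece values in Pawn units (the King has no value).
DEFAULT_VALUES = {"q": 9.0, "r": 5.0, "b": 3.0, "n": 3.0, "p": 1.0}


def simple_material_difference(epd_record, values=DEFAULT_VALUES):
    """Return active-color total minus non-active-color total."""
    fields = epd_record.split()
    placement, active = fields[0], fields[1]
    # Upper-case letters are White pieces, lower-case are Black.
    # Digits, "/" and the Kings contribute nothing.
    white = sum(values.get(ch.lower(), 0.0) for ch in placement if ch.isupper())
    black = sum(values.get(ch, 0.0) for ch in placement if ch.islower())
    return white - black if active == "w" else black - white


def in_range(epd_record, min_diff, max_diff, values=DEFAULT_VALUES):
    """True if the difference lies in the inclusive range."""
    diff = simple_material_difference(epd_record, values)
    return min_diff <= diff <= max_diff


record = "3r2q1/5pk1/6p1/6P1/7Q/8/6K1/7R w - -"
print(simple_material_difference(record))   # -1.0
print(in_range(record, -3, 1))              # True
```

Custom values would be passed as a dictionary in the same way, for
example {"q": 9.5, "r": 5, "b": 3.1, "n": 3, "p": 1}.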
Syntax: epdDifference filename.epd min_diff max_diff [Qval Rval Bval Nval] Examples: epdDifference alpha.epd 2 4 epdDifference alpha.epd -3 5 epdDifference alpha.epd 2.1 6 9.5 5 3.3 3.2 Output: outM.epd, outSMD.epd, excludeM.epd Comments: 1. There are many opinions by chess experts concerning the best relative piece valuations. See "Wikipedia" on relative values of chess pieces for further information. 2. The user-specified numbers for the range and the custom material numbers can be decimal numbers. For example: epdDifference alpha.epd 2.3 4.4 9.6 5.2 3.1 2.8 where 2.3 and 4.4 are the range numbers, and 9.6, 5.2, 3.1 and 2.8 are the custom material numbers for a Queen, a Rook, a Bishop and a Knight. =========================(6) epdExtra ============================= "epdExtra" separates the records of the input "epd" file based on the number of "extra" promoted pieces. An extra promoted piece is a 2nd or more Queen, a 3rd or more Rook, Bishop, or Knight. "epdExtra" separates the records of an "epd" file into 3 files: (a) "outX0.epd" : neither side has an "extra" promoted piece. (b) "outX1.epd" : one or both sides has exactly one "extra" promoted piece and neither side has more than one "extra" promoted piece. (c) "outX2.epd" : one side or both sides have 2 or more "extra" promoted pieces. When a Pawn is promoted, it does not always result in an "extra" promoted piece. For example, if one side has previously lost its Queen, a new Queen by promotion is not "extra" because that side now has just 1 Queen. If another Pawn is promoted to a Queen, that side will have 2 Queens and therefore one "extra" promoted piece. Examples: 1. White and Black do not have any "extra" promoted pieces. The record belongs in "outX0.epd". Zero "extra" promoted pieces. 2. White has 2 Queens. The record belongs in "outX1.epd". White has 1 "extra" promoted piece. 3. White has 2 Queens and Black has 2 Queens. This record belongs in "outX1.epd". 
White and Black EACH have 1 "extra" promoted piece. 4. White has 2 Queens and Black has 3 Rooks. The record belongs in "outX1.epd". White and Black EACH have 1 "extra" promoted piece. 5. White has 3 Queens and Black has 2 Queens. The record belongs in "outX2.epd". White has 2 "extra" promoted pieces in total. 6. Black has 4 Rooks. The record belongs in "outX2.epd". Black has 2 "extra" promoted pieces in total. 7. White has 3 Knights and 3 Bishops. The record belongs in "outX2.epd". White has 2 "extra" promoted pieces in total. Syntax: epdExtra filename.epd Usage: epdExtra alpha.epd Output: outX0.epd, outX1.epd, outX2.epd =========================(7) epdFaux ============================== "epdFaux" removes all faux "en passant" target square notations in an "epd" file. A faux "en passant" target square is one that cannot be attacked, legally or otherwise, by an opponent pawn. If not removed from "epd" records, the faux "en passant" target squares can cause output errors. An "en passant" target square is created immediately after a pawn advances two squares from its starting position. The target square is the square where the pawn would be if it had only moved one square. The opposing player is allowed to capture the pawn at the target square with one of his pawns, but only on his next move. Most of the time when an "en passant" target square is formed, it is IMPOSSIBLE for the opposing side to capture there with one of his pawns. When the capture is impossible, the target square is said to be "faux" (or "fake", "false", "spurious"). Removal of faux "en passant" target squares is desirable because an "epd" record with a "faux" target square is considered to be a different position than an "epd" record without a target square, even though the positions are the same. This difference interferes with searching for "epd" records of the same position. Faux "en passant" target squares have no redeeming value and should not have been permitted in the "epd" specifications. 
But since they exist, there needs to be a tool to remove them. There are three different situations that cause a faux target square. The first, and most common, is when the opponent does not have a pawn in position to capture. The two other situations are when the capture at the target square would be an "illegal move". One "illegal move" situation occurs when the opponent's pawn, in position to capture at the "en passant" target square, is pinned to its OWN king. The pawn cannot capture because its king would go into check. A second "illegal move" situation occurs when the pawn that moved two squares uncovers a "discovered" check on the opponent's king. The opponent's pawn, in position to capture at the "en passant" target square, cannot capture because its king must first get out of check, and after that, the "en passant" permission expires. In all three faux situations, "epdFaux" removes the "en passant" target square notation and replaces it with "-". In each of the following examples, White's last move was "e2-e4", creating a faux "en passant" target square notation at "e3": (1) No pawn in position to capture: rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 (2) The "capturing" pawn is pinned to its king: rnbq1bnr/ppp1pppp/5k2/8/P2pP2P/1P4P1/1BPP1P2/RN1QKBNR b KQ e3 (3) The two-square pawn move uncovered a "discovered" check: rnbq1bnr/ppp1pppp/8/8/2kpP2P/P5PN/RPPP1P1R/1NBQKB2 b - e3 The opposing player is NOT required to make an "en passant" pawn capture unless it is his only move to get his king out of check. The output file "outFaux.epd" contains the corrected records plus the other records. The output file "manifest-fx" contains a list of records that were corrected along with their line numbers. Syntax: epdFaux filename.epd Examples: epdFaux alpha.epd Output: outFaux.epd, manifest-fx Comments: 1. It is possible for two pawns to be in position to make a capture at the target square. 
This can only occur if the target square is in files "b" through "g". 2. A pawn that has just advanced two squares can be captured at either its "current" square or at its "en passant" target square. A capture at its "en passant" target square must be made by an opponent's pawn on the opponent's next move, while a capture at the "current" square can be made by any opponent piece at any time. 3. An "en passant" permission expires after the opponent's next move. This is unlike the "castling" permission for a rook which expires after the rook or its king moves. =========================(8) epdFin / pgnFin ====================== "epdFin" is used in combination with the utility tool "PGN-Extract" by David Barnes. Together they input a "pgn" file. Then for each game, they output the "epd" record of the final position. Optionally, "epdFin" can output a user-specified number of records from the end of each game. "PGN-Extract" outputs an intermediate "epd" file, "temp.epd", which contains all the "epd" records (positions) encountered in the input "pgn" file. A blank line separates successive games. Then "epdFin" inputs "temp.epd" and outputs the last record of each game to "outF.epd". If the user specifies the optional "record_number", then that number of records will be output from the end of each game. If a game has fewer records than the user-specified "record_number", then all of its records will be output. Example: PGN-Extract -Wepd --nofauxep -s -otemp.epd alpha.pgn epdFin temp.epd 3 will output the final 3 "epd" records from each game. If 2 or more records from the end of each game are requested, then the output file inserts a blank line between each output set. 
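
The selection step performed on "temp.epd" can be sketched in
Python as follows. This is only an illustration under stated
assumptions, not the actual "epdFin" code: the function names
"split_games" and "final_records" are hypothetical, and
"record_number" is assumed to be 1 or greater.

```python
# Illustrative sketch only; not the actual "epdFin" code.

def split_games(lines):
    """Group records into games; a blank line separates games."""
    games, current = [], []
    for line in lines:
        record = line.strip()
        if record:
            current.append(record)
        elif current:
            games.append(current)
            current = []
    if current:
        games.append(current)
    return games


def final_records(lines, record_number=1):
    """Return the last record_number records of each game."""
    out = []
    for game in split_games(lines):
        # A slice never fails: a short game yields all its records.
        tail = game[-record_number:]
        if record_number >= 2 and out:
            out.append("")   # blank line between output sets
        out.extend(tail)
    return out


lines = ["a1", "a2", "a3", "", "b1", "b2"]
print(final_records(lines))      # ['a3', 'b2']
print(final_records(lines, 2))   # ['a2', 'a3', '', 'b1', 'b2']
```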
Syntax: PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn epdFin temp.epd [record_number] Example: PGN-Extract -Wepd --nofauxep -s -otemp.epd alpha.pgn epdFin temp.epd Example: PGN-Extract -Wepd --nofauxep -s -otemp.epd alpha.pgn epdFin temp.epd 20 Output: outF.epd When using "epdFin" by itself: Syntax: epdFin filename.epd [record_number] Output: outF.epd Comments: 1. Because "epdFin" requires two tools to be executed consecutively, a "Windows" batch script ("cmd" file) may be used for convenience. You can copy and paste the following code into a text editor, and save it with the name "pgnFin.cmd" : PGN-Extract -Wepd --nofauxep -s -otemp.epd %1 epdFin temp.epd %2 Syntax: pgnFin filename.pgn [record_number] Example: pgnFin beta.pgn Example: pgnFin beta.pgn 5 2. "pgnFin.cmd" is included in the "40H-EPD" download. 3. The output file "outF.epd" can be used by "epdOccur" to see which games have the same final position. In this case, you should number comment each game using "gameNum" from "40H-PGN". 4. All files mentioned within the batch script must be in the Working Folder or on the Path. =========================(9) epdFlip =============================== "epdFlip" reverses the colors and does a vertical reflection about the imaginary horizontal line between ranks "4" and "5". The color to move, "castling" rights and "en passant" rights are also reversed. The new "epd" record is logically equivalent to the old "epd" record. r2qkb1r/pp1n1ppp/2p2n2/3pp3/6b1/3P1NP1/PPPNPPBP/R1BQ1RK1 w kq e6 r1bq1rk1/pppnppbp/3p1np1/6B1/3PP3/2P2N2/PP1N1PPP/R2QKB1R b KQ e3 To get a visual comparison, open 2 instances of your "PGN viewer". Put the original "epd" record in one instance, and put the "flipped" "epd" record in the other. Be sure both "PGN Viewers" have the same "View". Compare. There are 3 output files: "outZ.epd", "outZB.epd", and "outZW.epd". Output files of "epdFlip" remove most opcodes from the input file. 
However, the "bm" (best move) opcode OR the "am" (avoid move) opcode is retained provided that it is the first opcode. "outZ.epd" contains the flipped "epd" records. "outZB.epd" contains the original "Black to move" records and the flipped "White to move" records. All records in "outZB.epd" have "Black to move". "outZW.epd" contains the original "White to move" records and the flipped "Black to move" records. All records in "outZW.epd" have "White to move". Syntax: epdFlip filename.epd Example: epdFlip alpha.epd Output: outZ.epd, outZB.epd, outZW.epd Comments: 1. If the "bm" or "am" opcode is retained, only its first value is retained. 2. Fixed opcodes can be restored without adjustment using "40H-EPD" tools "epdRemove" and "epdInsert". =========================(10) epdImbalance ========================= "epdImbalance" separates the records into two files. One for records where the opposing sides have different material (imbalanced), and the other for records where the opposing sides have the same material (balanced). "outIM.epd" contains the records where the two sides DO NOT have the same set of pieces (imbalanced) and "excludeIM.epd" contains the records where the two sides have the same set of pieces (balanced). Syntax: epdImbalance filename.epd Example: epdImbalance alpha.epd Output: outIM.epd, excludeIM.epd Comment: 1. No distinction is made between bishops that move on different colored squares ("light"/"dark"). =========================(11) epdInsert =========================== "epdInsert" appends new opcodes to the records. "epdInsert" can be used in conjunction with "bmOpcode" and "idOpcode". New opcodes are first put into a text file "inlist". Sample "inlist" file: bm a4; bm Be2; c1 stalemate; The above 3 opcodes will be appended to records 1-2-3 of the "epd" input file. The user MUST check "inlist" for accuracy and line number. The new opcodes are attached to the records in the "epd" input file with the same line numbers. 
Blank lines are permitted in "inlist" and are necessary if the corresponding records in the "epd" input file are NOT to be changed. "inlist" must be located in the Working Folder and cannot be referenced using a pathname. Syntax: epdInsert inlist filename.epd Example: epdInsert inlist alpha.epd Output: outN.epd Comment: 1. "epdInsert" can be used with any "txt" file. =========================(12) epdInsuff / pgnInsuff =============== "epdInsuff / pgnInsuff" is used in combination with the utility tool "PGN-Extract" by David Barnes. Together they extract drawn games that end with insufficient checkmating material. There are 4 piece combinations that are insufficient for a checkmate. They are (1) Kk, (2) KBk / Kkb, (3) KNk / Kkn, and (4) KNNk / Kknn. In each of the 4 piece combinations, no additional pieces are on the chessboard. "epdInsuff" extracts all positions from the input "epd" file that have one of the four combinations. These "epd" positions are extracted to two files. The first file is a general file for any insufficient material position, and the other file is specific to the type of piece combination. There are five output files for the extracted "epd" positions: "outUT.epd" contains any position that has one of the four piece combinations (with no additional pieces). "outU0.epd" contains any position that has the "Kk" piece combination (with no additional pieces). "outU1.epd" contains any position that has the "KBk / Kkb" piece combination (with no additional pieces). "outU2.epd" contains any position that has the "KNk / Kkn" piece combination (with no additional pieces). "outU3.epd" contains any position that has the "KNNk / Kknn" piece combination (with no additional pieces). "epdInsuff" also outputs five "number" files that contain the game numbers of the games in the input "pgn" file that contain the "epdInsuff" output positions. These "number" files can be used by the "40H-PGN" tool to extract those games from the input "pgn" file. 
The "number" files are:

"numsT" contains the game numbers of the "pgn" games containing an
insufficient material piece combination (with no additional
pieces).

"nums0" contains the game numbers of the "pgn" games containing a
"Kk" piece combination (with no additional pieces).

"nums1" contains the game numbers of the "pgn" games containing a
"KBk / Kkb" piece combination (with no additional pieces).

"nums2" contains the game numbers of the "pgn" games containing a
"KNk / Kkn" piece combination (with no additional pieces).

"nums3" contains the game numbers of the "pgn" games containing a
"KNNk / Kknn" piece combination (with no additional pieces).

"epdInsuff" can be used with "PGN-Extract" by David Barnes and
"numExtract" from "40H-PGN" to input a "pgn" file and output games
that end in a position with insufficient material to checkmate.

Because a player has to claim a "draw", games with insufficient
checkmating material can (needlessly) continue if no "draw" is
claimed. In such a situation, it is possible for the piece
combination to be reduced. For example, "KNNk" can reduce to "KNk"
and then to "Kk".

-------
When using "epdInsuff" by itself:

Usage:   epdInsuff alpha.epd

Output:  outUT.epd, outU0.epd, outU1.epd, outU2.epd, outU3.epd
         numsT, nums0, nums1, nums2, nums3

-------
When using "epdInsuff" with "PGN-Extract":

Usage:   PGN-Extract -Wepd --nofauxep -s -otemp.epd beta.pgn
         epdInsuff temp.epd
         copy numsT numbers
         numExtract numbers beta.pgn

Output:  outZ.pgn, excludeZ.pgn,
         numsT, nums0, nums1, nums2, nums3,
         outUT.epd, outU0.epd, outU1.epd, outU2.epd, outU3.epd,
         temp.epd

-------
Comments:

1. Since you need to use several steps to extract "pgn" games
   containing a position with insufficient material, a "Windows"
   batch script ("cmd" file) may be used for convenience.
You can copy and paste the following code into a text editor, and save it with the name "pgnInsuff.cmd" : PGN-Extract -Wepd --nofauxep -s -otemp.epd %1 epdInsuff temp.epd copy numsT numbers numExtract numbers %1 Usage: pgnInsuff beta.pgn Output: outZ.pgn, excludeZ.pgn The output file "outZ.pgn" contains those games having one or more positions with insufficient material for a checkmate. Multiple positions are possible if the game needlessly continues. No game is output more than once. 2. Sometimes players keep on playing after reaching a position where there is insufficient material to checkmate. This can cause a game to have multiple positions with the same insufficient material. It can also cause a game to have more than one piece combination with insufficient material. For example, a "KNNk" game can lead to a "KNk" game, which in turn can lead to a "Kk" game. 3. "pgnInsuff.cmd" is included in the "40H-EPD" download. 4. All files mentioned within the batch script must be in the Working Folder or on the Path. =========================(13) epdKings ============================ "epdKings" extracts records where the Kings are advanced beyond each other. In other words, a record is extracted if the White King's rank (row) is higher than the Black King's rank (row). In the opening position, the White pieces are on Rank 1, and the Black pieces are on Rank 8. An example of the type of position being extracted is the White King on Rank 5 and the Black King on Rank 4. "outG.epd" contains the extracted records. "excludeG.epd" contains the remaining records. Syntax: epdKings filename.epd Example: epdKings alpha.epd Output: outG.epd, excludeG.epd =========================(14) epdMask / pgnMask =================== "epdMask" / "pgnMask" extracts records based on a user-supplied mask structure such as "5rk1/8/8/8/8/8/8/5RK1". The extracted records will contain the specified pieces at the specified squares. Squares not specified may contain a piece or may be empty. 
The mask structure is the same as the first field of an "epd"
specification. A blank square cannot be specified in the mask
structure.

The output file is "outK.epd".

"manifest-ms" lists the user-specified mask structure and each
record in "outK.epd". "manifest-ms" also gives the line number of
the "epd" record for "epd" input, and the game and ply numbers for
"pgn" input.

"epdMask" can also be used in combination with "PGN-Extract" by
David Barnes.

"numbers" lists the game numbers of the output games, and is only
needed when "PGN-Extract" is used. "numbers" is an input file for
"numExtract".

-------
When using "epdMask" by itself:

Syntax:   epdMask filename.epd mask_structure

Example:  epdMask alpha.epd 2kr4/8/8/8/8/8/8/2KR4

Output:   outK.epd, manifest-ms, numbers

-------
To extract games from a "pgn" file that contain a position with the
user-specified mask structure, you would execute these commands:

    PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn
    epdMask temp.epd mask_structure
    numExtract numbers filename.pgn

"PGN-Extract" outputs "temp.epd", which contains all the "epd"
records (positions) encountered in the input "pgn" file. A blank
line separates successive games.

"epdMask" inputs "temp.epd" and the user-specified mask structure.
Then each "epd" record containing the structure is output to
"outK.epd". Also, the game number is output to the file "numbers".

"numExtract" inputs the original "pgn" file and the file "numbers"
and extracts those games having a position matching the structure.
No game is output more than once.
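
The matching rule (specified pieces must be present on their
squares, all other squares are ignored) can be sketched in Python.
This is a hypothetical illustration, not the actual "epdMask" code;
the function names "expand" and "matches_mask" are assumptions.

```python
# Illustrative sketch only; not the actual "epdMask" code.

def expand(placement):
    """Expand a piece placement field to 64 characters ('.' = empty)."""
    squares = []
    for ch in placement:
        if ch.isdigit():
            squares.append("." * int(ch))
        elif ch != "/":
            squares.append(ch)
    board = "".join(squares)
    if len(board) != 64:
        raise ValueError("piece placement does not describe 64 squares")
    return board


def matches_mask(epd_record, mask):
    """True if every piece named in the mask is on the same square."""
    board = expand(epd_record.split()[0])
    wanted = expand(mask)
    # A '.' in the mask means "not specified": anything may be there.
    return all(m == "." or m == b for m, b in zip(wanted, board))


mask = "5rk1/8/8/8/8/8/8/5RK1"
print(matches_mask("r4rk1/ppp2ppp/8/8/8/8/PPP2PPP/R4RK1 w - -", mask))  # True
print(matches_mask("4k3/8/8/8/8/8/8/4K3 w - -", mask))                  # False
```

Expanding both strings to 64 squares makes the comparison a simple
square-by-square check, which is why unspecified squares may hold
any piece or be empty.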
-------
When using "epdMask" in combination with "PGN-Extract":

Syntax:   PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn
          epdMask temp.epd mask_structure
          numExtract numbers filename.pgn

Example:  PGN-Extract -Wepd --nofauxep -s -otemp.epd beta.pgn
          epdMask temp.epd 2kr4/8/8/8/8/8/8/2KR4
          numExtract numbers beta.pgn

Output:   outZ.pgn, excludeZ.pgn

The output file "outZ.pgn" contains those games having a position
containing the user-specified mask structure.

-------
Comments:

1. "epdMask" output matches the user-specified mask structure
   MINIMALLY, which means that additional pieces can be present.

2. Since you need to use several steps to extract "pgn" games
   containing a position with the user-specified mask structure, a
   "Windows" batch script ("cmd" file) may be used for convenience.
   You can copy and paste the following code into a text editor,
   and save it with the name "pgnMask.cmd" :

       PGN-Extract -Wepd --nofauxep -s -otemp.epd %1
       epdMask temp.epd %2
       numExtract numbers %1

   Syntax:   pgnMask filename.pgn mask_structure

   Example:  pgnMask beta.pgn 2kr4/8/8/8/8/8/8/5RK1

   Output:   outZ.pgn, excludeZ.pgn

3. "pgnMask.cmd" is included in the "40H-EPD" download.

4. All files mentioned within the batch script must be in the
   Working Folder or on the Path.

=========================(15) epdMaterial/pgnMaterial =============

"epdMaterial" extracts records based on a user-specified text file
"pieces" that lists piece/quantity values. "epdMaterial" can also
be used in combination with "PGN-Extract" by David Barnes.

To create the text file "pieces", use the "epd" notation for the
different pieces followed by a space and then the quantity. Each
piece must be on a separate line. Kings are not listed since there
must be 1 for each side.
For example, "pieces" could be: N 2 R 0 Q 1 b 0 r 2 p 4 q 0 The above piece/quantity values require an extracted record to have (in addition to the 2 kings) exactly 2 White knights ("N"), 0 White rooks ("R"), 1 White queen ("Q"), 0 Black bishops ("b"), 2 Black rooks ("r"), 4 Black pawns ("p") and 0 Black queens ("q"). Also the above value list does not have values for White pawns ("P"), White bishops ("B"), or Black knights ("n"). These pieces can be present in any legal quantity. Pieces and their quantities in "pieces" can be listed in any order. The use of the file "pieces" allows the user to specify, or NOT specify, an exact quantity for each of the 10 types of pieces: (P, N, B, R, Q, p, n, b, r, q). The number of blank squares cannot be specified. When using "epdMaterial" by itself: Syntax: epdMaterial pieces filename.epd Example: epdMaterial pieces alpha.epd Output: outV.epd, excludeV.epd, numbers The output file "outV.epd" extracts those records in the input "epd" file whose material is specified in "pieces". "excludeV.epd" contains the remaining records. "numbers" lists the game numbers of the output games. This is only used when "PGN-Extract" is used. "numbers" is an input file for "numExtract". To extract games from a "pgn" file that contains a position with the material specified in "pieces", you would execute these commands: PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn epdMaterial pieces temp.epd numExtract numbers filename.pgn "PGN-Extract" outputs an "epd" file, "temp.epd", which contains ALL the positions ("epd" records) encountered in the input "pgn" file. A blank line separates successive games. "epdMaterial" inputs the user-specified text file "pieces" and the file "temp.epd". It then outputs each record having the material specified in "pieces" to "outV.epd". It also outputs a list of game numbers that produced the extracted records to the file "numbers". 
"numExtract" inputs the file "numbers" and the original "pgn" file and extracts games containing a position having the material specified in "pieces". No game is output more than once. When using "epdMaterial" in combination with "PGN-Extract": Syntax: PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn epdMaterial pieces temp.epd numExtract numbers filename.pgn Example: PGN-Extract -Wepd --nofauxep -s -otemp.epd beta.pgn epdMaterial pieces temp.epd numExtract numbers beta.pgn Output: outZ.pgn, excludeZ.pgn Comments: 1. "pieces" is case-sensitive. 2. Since several steps are needed to extract "pgn" games containing a position with the user-specified pieces, a "Windows" batch script ("cmd" file) may be used for convenience. You can copy and paste the following code into a text editor, and save it with the name "pgnMaterial.cmd" : PGN-Extract -Wepd --nofauxep -s -otemp.epd %2 epdMaterial pieces temp.epd numExtract numbers %2 Syntax: pgnMaterial pieces filename.pgn Example: pgnMaterial pieces beta.pgn Output: outZ.pgn, excludeZ.pgn 3. "pgnMaterial.cmd" is included in the "40H-EPD" download. 4. All files mentioned within the batch script must be in the Working Folder or on the Path. =========================(16) epdMerge ============================ "epdMerge" joins 2, 3, 4 or 5 "epd" files by adding one line from each successive input file, then repeating. The output file is "outMG.epd". Files do NOT have to have the same number of lines. Example: Suppose fileA.epd has 4 lines, fileB.epd has 2 lines and fileC.epd has 3 lines. 
Using the command:

    epdMerge fileA.epd fileB.epd fileC.epd

The output file "outMG.epd" will be:

    line1 from fileA.epd
    line1 from fileB.epd
    line1 from fileC.epd
    line2 from fileA.epd
    line2 from fileB.epd
    line2 from fileC.epd
    line3 from fileA.epd
    line3 from fileC.epd
    line4 from fileA.epd

Syntax:   epdMerge file1.epd file2.epd [file3.epd file4.epd file5.epd]

Examples: epdMerge alpha.epd beta.epd
          epdMerge alpha.epd beta.epd gamma.epd
          epdMerge alpha.epd beta.epd gamma.epd delta.epd
          epdMerge alpha.epd beta.epd gamma.epd delta.epd epsilon.epd

Output:   outMG.epd

Comments:

1. Do not confuse merging files with concatenation. Concatenation
   joins files together by linking them together in a series.

2. In "Windows", to concatenate multiple files into one file, use
   the "copy" command in the command line.

=========================(17) epdOccur ==============================

"epdOccur" lists the number of occurrences of each distinct
position in the input "epd" file. An "epd" position consists of the
first 4 tokens of an "epd" record. "epdOccur" also lists the line
numbers where the positions occurred.

Records having the same "piece placement", "side to move",
"castling rights" and "en passant" target square values are
considered to have the same position. All four items must agree.
Records with the same position can have different opcodes.

A record with a faux "en passant" target square will have a
different position compared to the same record not having a target
square. Therefore the user should remove faux "en passant" target
squares before using "epdOccur". The tool "epdFaux" can be used to
remove them.

The output files are "outL1.epd" and "outL2.epd". "outL1.epd"
contains opcode "c0" which lists the number of occurrences of the
position. "outL2.epd" also contains the opcode "c1" which lists the
line numbers of the occurrences. Original opcodes are omitted in
the output files.
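
The counting rule (a position is the first 4 tokens of a record)
can be sketched in Python. This is an illustration only, not the
actual "epdOccur" code; the function names "count_positions" and
"format_record" are hypothetical, and skipping lines with fewer
than 4 tokens is an assumption of the sketch.

```python
# Illustrative sketch only; not the actual "epdOccur" code.

def count_positions(lines):
    """Map each position to the 1-based line numbers where it occurs."""
    occurrences = {}   # dicts keep insertion order in Python 3.7+
    for number, line in enumerate(lines, start=1):
        tokens = line.split()
        if len(tokens) < 4:
            continue   # blank or incomplete lines are skipped here
        position = " ".join(tokens[:4])   # opcodes are ignored
        occurrences.setdefault(position, []).append(number)
    return occurrences


def format_record(position, line_numbers):
    """Build a record carrying "c0" (count) and "c1" (line numbers)."""
    joined = " ".join(str(n) for n in line_numbers)
    return f"{position} c0 {len(line_numbers)}; c1 line(s): {joined};"


lines = [
    "8/8/8/8/8/8/k7/2K5 w - - bm Kc2;",
    "",
    '8/8/8/8/8/8/k7/2K5 w - - id "x";',
    "8/8/8/8/8/8/k7/2K5 b - -",
]
for position, numbers in count_positions(lines).items():
    print(format_record(position, numbers))
```

The two "White to move" records count as one position even though
their opcodes differ, while the "Black to move" record is a
separate position.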
Example of an output line in "outL2.epd":

    3Q4/p3b1k1/2p2rPp/2q5/4B3/P2P4/8/6RK w - - c0 2; c1 line(s): 467 536;

"c0" indicates two occurrences of the position and "c1" indicates
that they are on lines 467 and 536.

If you want to remove duplicate "epd" positions, then you should
use "epdSingle".

Syntax:   epdFaux filename.epd
          epdOccur outFaux.epd

Output:   outL1.epd, outL2.epd

Comments:

1. Removing faux "en passant" target squares from the "epd" input
   file is necessary to ensure full and accurate output from
   "epdOccur". This can be accomplished by first using "epdFaux".

2. "txtOccur" lists the occurrences of full lines whereas
   "epdOccur" lists the occurrences of records containing the same
   position. "epdOccur" ignores opcodes.

3. "epdOccur" ignores blank lines and does not output blank lines.

=========================(18) epdOrder =============================

"epdOrder" sorts the records in descending order based on the
"centipawn evaluation" ("ce" opcode) of the position from White's
perspective.

The "ce" opcode evaluates the strength of the position from the
viewpoint of the White pieces. The record with the highest "ce"
value (the strongest position for White) is listed first and the
record with the lowest "ce" value (the weakest position for White)
is listed last.

"ce" opcode values are expressed in "centipawns". A "centipawn" is
one-hundredth of a pawn. Examples: "137" and "-137". A "ce" value
of 137 means that White is evaluated to be winning by the
equivalent of 1.37 pawns. Similarly, a "ce" value of -137 means
that White is evaluated to be losing by the equivalent of 1.37
pawns.

In addition, there are "ce" values that start with "+M" and "-M".
Examples: "+M5" and "-M5". These values indicate the number of
moves to checkmate with ACCURATE play. A value of "+M2" is rated
higher than "+M5" because White can mate in fewer moves. Likewise
"-M5" is rated higher than "-M2" because White gets mated in a
greater number of moves.
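
The ordering rules above can be expressed as a sort key. The Python
sketch below is a hypothetical illustration, not the actual
"epdOrder" code. It ASSUMES that every "+M" value ranks above every
numerical value and every "-M" value ranks below every numerical
value; the function name "ce_sort_key" is also an assumption.

```python
# Illustrative sketch only; not the actual "epdOrder" code.

def ce_sort_key(ce_value):
    """Return a key where a larger key is better for White."""
    text = ce_value.strip()
    if text.startswith("+M"):
        # White mates: fewer moves is better, so negate the count.
        return (2, -int(text[2:]))
    if text.startswith("-M"):
        # White gets mated: more moves is better.
        return (0, int(text[2:]))
    return (1, int(text))   # plain centipawn value


values = ["-M2", "137", "+M5", "-137", "+M2", "-M5"]
print(sorted(values, key=ce_sort_key, reverse=True))
# ['+M2', '+M5', '137', '-137', '-M5', '-M2']
```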
The "M" values are technically not part of "EPD Standards" but are
commonly used as they are easily converted to and from the
numerical values that "ce" uses.

Syntax:   epdOrder filename.epd

Example:  epdOrder alpha.epd

Output:   outD.epd

=========================(19) epdPawnDiff =========================

"epdPawnDiff" separates records based on the difference in the
number of pawns of the two sides.

"outPD.epd" contains the records where the absolute value of the
difference in the number of pawns of the two sides equals or
exceeds a user-specified number (1-8). "excludePD.epd" contains the
remaining records.

The optional "min_pawn_difference" parameter must be a number from
"1" to "8".

For example:

    epdPawnDiff alpha.epd 1

"outPD.epd" contains records where the absolute value of the
difference in the number of pawns is "1" or more, and
"excludePD.epd" contains the remaining records.

    epdPawnDiff alpha.epd 3

"outPD.epd" contains records where the absolute value of the
difference in the number of pawns is "3" or more, and
"excludePD.epd" contains the remaining records.

Syntax:   epdPawnDiff filename.epd min_pawn_difference

Examples: epdPawnDiff alpha.epd 1
          epdPawnDiff alpha.epd 5

Output:   outPD.epd, excludePD.epd

=========================(20) epdPieces ===========================

"epdPieces" extracts records based on a user-specified number range
for the number of chess pieces on the board.

Usage examples:

    epdPieces alpha.epd 10 15

outputs "epd" records containing 10 to 15 pieces inclusive.

    epdPieces alpha.epd 11 11

outputs "epd" records containing exactly 11 pieces.

The output file "outP.epd" contains the "epd" records whose pieces
are in the user-specified number range. "excludeP.epd" contains the
remaining records. "manifest-pc" contains the output records of
"outP.epd" followed by their line number and number of pieces.
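
Counting the pieces in a record can be sketched in Python as shown
below. This is not the actual "epdPieces" code. The sketch ASSUMES
that "pieces" means every man on the board, including Kings and
Pawns; the function names "count_pieces" and "split_by_piece_count"
are hypothetical.

```python
# Illustrative sketch only; not the actual "epdPieces" code.

def count_pieces(epd_record):
    """Count the letters in the piece placement field."""
    placement = epd_record.split()[0]
    # Digits (empty squares) and "/" (rank separators) are not pieces.
    return sum(1 for ch in placement if ch.isalpha())


def split_by_piece_count(records, min_pieces, max_pieces):
    """Separate records into (in range, out of range); range is inclusive."""
    selected, excluded = [], []
    for record in records:
        if not record.strip():
            continue   # blank lines are skipped in this sketch
        if min_pieces <= count_pieces(record) <= max_pieces:
            selected.append(record)
        else:
            excluded.append(record)
    return selected, excluded


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"
print(count_pieces(start))                          # 32
print(count_pieces("8/8/8/8/8/8/k7/2K5 w - -"))     # 2
```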
Syntax:   epdPieces filename.epd min_pieces max_pieces

Examples: epdPieces alpha.epd 10 10
          epdPieces alpha.epd 8 11

Output:   outP.epd, excludeP.epd, manifest-pc

=========================(21) epdPly / pgnPly =====================

"epdPly" is used in combination with the utility tool "PGN-Extract"
by David Barnes. Together they input a "pgn" file. Then for each
game, they output the "epd" record of the position after a
user-supplied number of plies.

"PGN-Extract" outputs an intermediate "epd" file, "temp.epd", which
contains all the "epd" records (positions) encountered in the input
"pgn" file. A blank line separates successive games.

Then "epdPly" inputs "temp.epd" and a user-specified ply number.
Then for each game, the "epd" record for the position after the
user-specified ply is output to "outY.epd". If the record does not
exist, a blank line is output.

For example, if 0 is the user-specified ply number, the first
record of each game is output.

For example, if 24 is the user-specified ply number, the 25th
record of each game, if it exists, is output. If it does not exist,
a blank line is output.

Syntax:   PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn
          epdPly temp.epd ply_number

Example:  PGN-Extract -Wepd --nofauxep -s -otemp.epd alpha.pgn
          epdPly temp.epd 30

Output:   outY.epd

When using "epdPly" by itself:

Syntax:   epdPly filename.epd ply_number

Output:   outY.epd

Comments:

1. Because "epdPly" requires two tools to be executed
   consecutively, a "Windows" batch script ("cmd" file) may be used
   for convenience. You can copy and paste the following code into
   a text editor, and save it with the name "pgnPly.cmd" :

       PGN-Extract -Wepd --nofauxep -s -otemp.epd %1
       epdPly temp.epd %2

   Syntax:   pgnPly filename.pgn ply_number

   Example:  pgnPly beta.pgn 30

2. "pgnPly.cmd" is included in the "40H-EPD" download.

3. The output file "outY.epd" can be used by "epdOccur" to see
   which games have the same position after the user-specified ply.
In this case, you should add a number comment to each game using "gameNum" from "40H-PGN".

4. All files mentioned within the batch script must be in the Working Folder or on the Path.

=========================(22) epdPosition / pgnPosition ===========

"epdPosition" / "pgnPosition" extracts records based on a user-supplied board position such as 1R6/r1p5/k1K5/8/1P6/4b3/8/8. The board position of each extracted record matches the user-supplied board position.

The structure of the board position is the same as the first field of an "epd" record. The board position DOES NOT CONTAIN the side to move, castling permissions, or "en passant" permission.

"epdPosition" can also be used in combination with "PGN-Extract" by David Barnes.

The output files of "epdPosition" are "outS.epd", "manifest-ps", and "numbers".

"outS.epd" contains the records that match the board position.

"manifest-ps" outputs the user-specified board position and the "epd" output records. "manifest-ps" also lists the line numbers of the "epd" records (for "epd" input), and the game and ply numbers (for "pgn" input).

"numbers" lists the game numbers of the output games, and is only needed when "PGN-Extract" is used. "numbers" is an input file for "numExtract".

-------

When using "epdPosition" by itself:

Syntax:   epdPosition filename.epd board_position
Example:  epdPosition alpha.epd 1R6/r1p5/k1K5/8/1P6/4b3/8/8
Output:   outS.epd, manifest-ps, numbers

-------

To extract games from a "pgn" file that contain a position matching the user-specified board position, you would execute these commands:

   PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn
   epdPosition temp.epd board_position
   numExtract numbers filename.pgn

"PGN-Extract" outputs "temp.epd", which contains all the "epd" records (positions) encountered in the input "pgn" file. A blank line separates successive games.

"epdPosition" inputs "temp.epd" and the user-specified board position. Then each "epd" record containing the board position is output to "outS.epd".
Also, the game numbers are output to "numbers".

"numExtract" inputs the original "pgn" file and the file "numbers" and extracts those games having a position matching the board position. No game is output more than once.

-------

When using "epdPosition" in combination with "PGN-Extract":

Syntax:   PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn
          epdPosition temp.epd board_position
          numExtract numbers filename.pgn
Example:  PGN-Extract -Wepd --nofauxep -s -otemp.epd beta.pgn
          epdPosition temp.epd 1R6/r1p5/k1K5/8/1P6/4b3/8/8
          numExtract numbers beta.pgn
Output:   outZ.pgn, excludeZ.pgn, outS.epd, manifest-ps, numbers

The output file "outZ.pgn" contains those games having a position matching the user-specified board position.

-------

Comments:

1. Since you need several steps to extract "pgn" games containing a position that matches the user-specified board position, a "Windows" batch script ("cmd" file) may be used for convenience. You can copy and paste the following code into a text editor, and save it with the name "pgnPosition.cmd" :

   PGN-Extract -Wepd --nofauxep -s -otemp.epd %1
   epdPosition temp.epd %2
   numExtract numbers %1

   Syntax:   pgnPosition filename.pgn board_position
   Example:  pgnPosition beta.pgn 1R6/r1p5/k1K5/8/1P6/4b3/8/8
   Output:   outZ.pgn, excludeZ.pgn, outS.epd, manifest-ps, numbers

2. "pgnPosition.cmd" is included in the "40H-EPD" download.

3. All files mentioned within the batch script must be in the Working Folder or on the Path.

=========================(23) epdRandom ===========================

"epdRandom" randomly rearranges the lines of the input file. The user has the option to only output a user-specified number of lines. The output file is "outRN.epd".

The default is to output all the lines. In this case, do not use a value for the parameter "num_lines".

If the same line is output more than once, it is because that line was repeated in the input file.
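The behavior described for "epdRandom" (shuffle all lines, then optionally keep only the first "num_lines") can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's implementation. The function name "shuffle_lines" and its "seed" parameter are hypothetical, and Python's "random" module stands in for whatever generator the tool itself uses.

```python
import random


def shuffle_lines(lines, num_lines=None, seed=None):
    """Return the lines in random order, optionally truncated.

    The input list is not modified. A line appears twice in the
    result only if it appeared twice in the input.
    """
    rng = random.Random(seed)   # seed is optional, for repeatable runs
    shuffled = list(lines)      # work on a copy
    rng.shuffle(shuffled)
    if num_lines is None:
        return shuffled         # default: output every line
    return shuffled[:num_lines]
```

Shuffling a copy and then truncating means each input line is used at most once, which matches the statement that repeated output lines come only from repeated input lines.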
Syntax:   epdRandom epdfile_name [num_lines]
Usage:    epdRandom alpha.epd
          epdRandom alpha.epd 120
Output:   outRN.epd

Comments:

1. Do not use a value for the parameter "num_lines" if you want to output all the lines of the input file.

2. "epdRandom" uses the random number generator from "Java 8".

=========================(24) epdRemove ===========================

"epdRemove" removes a user-specified opcode from the "epd" input file and saves the removed opcode and its values in a second output file.

"epdRemove" outputs the file "outR.epd". "outR.epd" does not contain the specified opcode. "rlist" lists the removed opcode and its values. This tool could also be called "epdOpcodeExtract" because of "rlist".

The user can only specify one opcode per execution. If there are multiple instances of the same opcode in a record, all instances will be removed.

If you want to use "epdRemove" again, you have to first rename the output file before using it as the input file. Use "epdTrim" if you want to remove all opcodes.

Blank lines are retained in both "outR.epd" and "rlist".

Syntax:   epdRemove filename.epd opcode_type
Example:  epdRemove alpha.epd id
Output:   outR.epd, rlist

=========================(25) epdSingle ===========================

"epdSingle" removes records containing the same position as a previous record in the "epd" file.

An "epd" position consists of the first 4 tokens of an "epd" record. Positions having the same "piece placement", "side to move", "castling rights" and "en passant" target square values are considered to be the same position. All four items must agree.

A record with a faux "en passant" target square will have a different position compared to the same record not having a target square. Therefore remove faux "en passant" target squares before using "epdSingle". The tool "epdFaux" can be used to remove them.

Records with the same position can have different opcodes.

The output file is outA.epd. The removed records are saved in excludeA.epd.
The output records remain in the original order. Opcodes are not changed. Blank lines are removed.

Syntax:   epdFaux filename.epd
          epdSingle outFaux.epd
Output:   outA.epd, excludeA.epd

Comments:

1. Removing faux "en passant" target squares from the "epd" input file is necessary to ensure full and accurate output from "epdSingle". This can be accomplished by first using "epdFaux".

2. "txtSingle" removes duplicates of full lines whereas "epdSingle" removes lines containing the same position as a previous record. "epdSingle" ignores opcodes.

3. "epdSingle" ignores blank lines and does not output blank lines.

=========================(26) epdSort =============================

"epdSort" sorts the lines of an "epd" file alphanumerically. It sorts in either ascending or descending order.

Default is ascending order. Use the optional parameter "down" for descending order.

Blank lines are deleted in the output file.

Syntax:   epdSort filename.epd [down]
Examples: epdSort alpha.epd
          epdSort alpha.epd down
Output:   outSort.epd

Comments:

1. "epdSort" can be used with any "txt" file.

2. "down" is case-sensitive.

=========================(27) epdToken ============================

"epdToken" uses a user-specified "token_number" to output the token (field) in that position on each line. If that token does not exist, a blank line is output.

For example:

   epdToken alpha.txt 5

will output the 5th token of each line if it exists, otherwise it will output a blank line.

The user can also output a continuous range of tokens.

For example:

   epdToken alpha.txt 2 4

will output the 2nd, 3rd and 4th token on each line, if they exist. If none exist, it will output a blank line.

The output file is tklist.
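The token selection that "epdToken" describes can be illustrated with a small Python sketch. It is not the tool's own code. The function name "extract_tokens" is hypothetical, and the sketch assumes that a token is any run of characters separated by whitespace.

```python
def extract_tokens(line, token_number, last_token_number=None):
    """Return tokens token_number..last_token_number (1-based, inclusive).

    Assumption: tokens are whitespace-separated. If none of the
    requested tokens exist, an empty string (a blank line) is returned.
    """
    if last_token_number is None:
        last_token_number = token_number   # single-token form
    tokens = line.split()
    return " ".join(tokens[token_number - 1:last_token_number])
```

For the range form, tokens that exist are returned even when the line is too short to supply the whole range, which is one reading of "if they exist".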
Syntax:   epdToken filename.txt token_number [last_token_number]
Usage:    epdToken alpha.epd 3
          epdToken alpha.epd 2 5
Output:   tklist

=========================(28) epdTrim =============================

"epdTrim" removes all opcodes from the "epd" input file, and saves the removed opcodes in two different ways - horizontally and vertically. There are 3 output files.

The output file "outT.epd" does not have any opcodes. Blank lines are retained.

The output file "hlist" lists the removed opcode values horizontally as they appear in the input file. Blank lines are retained.

Sample output in "hlist":

   bm f5; id "Undermine.001"; c0 "f5=10, Be5+=2, Bf2=3, Bg4=2"
   bm c5; id "Undermine.002"; c0 "c5=10, Qd4+=4, b5=4, g4=3";

The output file "vlist" lists the line number and then each removed opcode value on a separate line.

Sample output in "vlist":

   line # 1
   bm f5
   id "Undermine.001"
   c0 "f5=10, Be5+=2, Bf2=3, Bg4=2"

   line # 2
   bm c5
   id "Undermine.002"
   c0 "c5=10, Qd4+=4, b5=4, g4=3"

Syntax:   epdTrim filename.epd
Example:  epdTrim alpha.epd
Output:   outT.epd, hlist, vlist

Comments:

1. Use "epdRemove" to remove one opcode or to list the values of one opcode.

=========================(29) epdTriple ======================

"epdTriple" extracts records where there are 3+ pawns of the same color in the same column. Another output file contains records with 4+ pawns of the same color in the same column.

"outTP.epd" contains the records having 3+ pawns of the same color in the same column. Records not extracted are in "excludeTP.epd".

"outQP.epd" contains the records having 4+ pawns of the same color in the same column. Records not extracted are in "excludeQP.epd".

Syntax:   epdTriple filename.epd
Example:  epdTriple alpha.epd
Output:   outTP.epd, excludeTP.epd, outQP.epd, excludeQP.epd

Comments:

1. Either side can be the side to move.

2. A record in "outQP.epd" will also be in "outTP.epd".
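The pawn test that "epdTriple" describes can be sketched in Python as follows. This is an illustration only, not the tool's implementation. The function name "max_pawns_in_column" is hypothetical, and the sketch assumes the first "epd" field is a well-formed piece placement with 8 ranks separated by "/".

```python
def max_pawns_in_column(epd_record: str) -> int:
    """Largest number of same-colored pawns found in any one column."""
    placement = epd_record.split()[0]
    # One counter per column (a-h) for white ("P") and black ("p") pawns.
    counts = {"P": [0] * 8, "p": [0] * 8}
    for rank in placement.split("/"):
        column = 0
        for ch in rank:
            if ch.isdigit():
                column += int(ch)        # skip that many empty squares
            else:
                if ch in counts:
                    counts[ch][column] += 1
                column += 1
    return max(max(counts["P"]), max(counts["p"]))
```

Under these assumptions, a result of 3 or more would correspond to a record in "outTP.epd", and a result of 4 or more to a record in "outQP.epd" as well.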
=========================(30) idOpcode ============================

"idOpcode" outputs a file of "id" opcodes, each containing an ID number. The default starting ID number is "1". Succeeding numbers increase by 1. The user can optionally specify a different starting number.

The starting ID number can be changed by listing it at the end of the command line.

For example:

   idOpcode alpha.txt 31

will start with "31" as the first ID number.

Sample output in output file "idlist":

   id "1";
   id "2";
   id "3";

The user can attach the opcodes in "idlist" to the records of the "epd" file by using "epdInsert".

For example:

   idOpcode beta.txt
   copy idlist inlist
   epdInsert inlist beta.txt

Sample output line:

   r1bq1rk1/pp2bppp/2n2n2/2pp2B1/3P4/2N2NP1/PP2PPBP/R2Q1RK1 b - - id "4";

Syntax:   idOpcode filename.epd [startNum]
Examples: idOpcode alpha.epd
          idOpcode alpha.epd 51
Output:   idlist

=========================(31) smOpcode ============================

"smOpcode" outputs supplied move ("sm") opcodes from a "pgn" file. Each "sm" opcode corresponds to an actual move in the "pgn" file.

For example:

   sm 14W Qxh7;

indicates that White's 14th move is Qxh7.

"smOpcode" inputs a "pgn" file and outputs opcodes for attachment in an "epd" file. The output file "smlist" lists the "sm" opcodes one per line.

If there are multiple games in the "pgn" file, then "smlist" separates the "sm" opcodes of successive games by inserting two blank lines. The reason for the extra blank line is that no move is played from the final position.

The "sm" opcodes can be appended to the "epd" records of the positions where the moves were played. An "epd" file can be created from an input "pgn" file using "PGN-Extract" by David Barnes. Then "epdInsert" can be used to append the "sm" opcodes to the corresponding records of the "epd" file.

To append the "sm" opcodes to the "epd" file from "PGN-Extract":

   1. PGN-Extract -Wepd --nofauxep -s -obeta.epd beta.pgn
   2. smOpcode beta.pgn
   3. << User checks smlist against beta.epd >>
   4. copy smlist inlist
   5. epdInsert inlist beta.epd
   6. << Output is outN.epd >>

Usage:    smOpcode beta.pgn
Output:   smlist

=========================(32) txtChar =============================

"txtChar" is useful in determining the presence of unexpected "control" or "extended" characters in a text file. Such characters may cause issues with processing the text file.

"txtChar" outputs the location and ASCII value of "control" characters (ASCII < 32) and "extended" characters (ASCII > 127), with the exception of "carriage return" (CR) and "line feed" (LF). CR and LF are present at the end of each line, although they are sometimes missing from the last line.

Also omitted is an "end of file" (EOF) marker. The EOF marker is variable and depends on the operating system or programming environment, and is sometimes missing.

The Hex editor "XVI32" and various text editors can be used to edit the file for "control" or "extended" characters. Note that "XVI32" uses hexadecimal numbers instead of decimal numbers.

There are many versions of ASCII "extended" characters. Character output varies by version.

The output file "charList" gives the location of the "control" and "extended" characters (except CR, LF and EOF).

Syntax:   txtChar filename.txt
Usage:    txtChar alpha.txt
Output:   charList

Comments:

1. The input file is NOT required to have a "txt" extension.

=========================(33) txtColumn ===========================

"txtColumn" outputs a contiguous range of columns in a text file. It uses user-specified starting and ending column numbers. The user-specified column numbers must be from 1 to 1000. The column numbers are inclusive.

There are 2 output files, "outT.txt" and "excludeT.txt". "outT.txt" contains the extracted columns. "excludeT.txt" contains the columns not in "outT.txt". These columns are condensed and realigned.

For example:

   txtColumn alpha.txt 1 20

outputs columns 1 to 20, inclusive, to "outT.txt". The remaining data, columns 21 to the end, is output to "excludeT.txt".
   txtColumn alpha.txt 15 30

outputs columns 15 to 30, inclusive, to "outT.txt". The remaining data, columns 1 to 14, and columns 31 to the end, is output to "excludeT.txt".

"txtColumn" can be used to truncate the lines of a text file.

   txtColumn alpha.txt 1 40

truncates each line after column 40.

Syntax:   txtColumn filename.txt starting_column ending_column
Example:  txtColumn alpha.txt 21 60
Output:   outT.txt, excludeT.txt

Comments:

1. The input file is NOT required to have a "txt" extension.

2. The input file cannot contain a "tab" character. Only tabs created by multiple spaces are acceptable.

=========================(34) txtOccur ============================

"txtOccur" lists the number of occurrences of each distinct non-blank line. It also lists the line numbers where the lines occurred.

The output lines in "outL.txt" contain the line from the "txt" file, followed by the comment indicator "c0", followed by the number of occurrences and followed by ";".

Sample output line from "outL.txt":

   Have a nice day! c0 10;

The output lines in "outL2.txt" contain what "outL.txt" contains, followed by the comment indicator "c1", followed by the line numbers of the occurrences and followed by ";".

Sample output line from "outL2.txt":

   Have a nice day! c0 10; c1 8 9 10 34 35 38 44 45 46 47;

Syntax:   txtOccur filename.txt
Examples: txtOccur alpha.txt
Output:   outL.txt, outL2.txt

Comments:

1. "txtOccur" lists the occurrences of full lines whereas "epdOccur" lists the occurrences of records containing the same position.

2. "txtOccur" ignores blank lines and does not output blank lines.

=========================(35) txtSingle ===========================

"txtSingle" removes any line that is a duplicate of a prior line. The remaining lines are in their original order.

The prior line does NOT have to be immediately prior. It can be any line previous to the current line.

The output file is outA.txt.
The removed records are saved in excludeA.txt.

Syntax:   txtSingle filename.txt
Example:  txtSingle alpha.txt
Output:   outA.txt, excludeA.txt

Comments:

1. The input file is NOT required to have a "txt" extension.

2. "txtSingle" removes duplicates of full lines whereas "epdSingle" removes lines containing the same position as a previous record.

3. "txtSingle" ignores blank lines and does not output blank lines.

====================================================================
====================================================================