"readme-EPD" for "40H-EPD-2024A" Tools & Utilities "40H-EPD" is a collection of 35 "EPD/TXT" command-line utility tools by Norman Pollock (USA) for use in "Windows" or in a "Java" environment. "40H-EPD" tools and instructions are copyright (c) 2010-2024 by Norman Pollock. All rights reserved. "40H-EPD" can be freely distributed and used for non-commercial purposes as long as "40H-EPD" is kept intact and no changes are made to any files. Any commercial distribution or use of "40H-EPD", or any of its files, is strictly forbidden. Disclaimer: The "40H-EPD" package is distributed "as is". No warranty or guarantee of any kind is expressed or implied. The user assumes all risks of usage. The author is not responsible for any damage or losses of any kind caused by the use or misuse of the tools or this "readme-EPD" file. Current download sites for "40H Tools": 40hchess.epizy.com nk-qy.info/40h (Thanks to Frank Quisinsky) Please bookmark. Contact: Norman Pollock rc1242@yahoo.com ==================================================================== FAQ: 1. What is the purpose of "40H" tools? "40H" tools enable users to do specific chess tasks. 2. What computer skills are needed to use the "40H" tools? The user has to be able to use command-line commands in a "Command Prompt" window. 3. Where are the full usage instructions for each "40H-EPD" tool? Please read the full usage instructions before using a tool. FULL USAGE INSTRUCTIONS are the last section of this readme. A quick way to go there is by using a text search (ctrl+F). 4. What does the term "40H" have to do with chess? 40H is a hexadecimal and computer science number that is equal to the number of squares in a chessboard. 40H = 64 decimal. 5. Do the "40H" tools require Internet access? Only to download the tools. 6. On which platforms will the "40H" tools work? "40H" tools come in 2 formats: "Windows" executables (".exe") and "Java Class Files". Download "40H-Java Class Files" for non-Windows platforms. 7. 
Are the "40H" tools portable? Yes. You can even save them on a flash drive and use them on different PCs. They are single files and no "dlls" are required. They do not require a "setup" program and they do not affect the registry. They can be removed by simple deletion. 8. Are the "40H" tools 64-bit and do they use multiprocessing? They are each 32-bit tools and they do NOT use multiprocessing. 9. Are the "40H" downloads checked for viruses/malware? "40H" downloads are checked at www.virustotal.com 10. Is the "readme" file included in the download? Yes, and it is also on the website. The version on the website has the latest updates and corrections. ==================================================================== OVERVIEW OF TOOLS (FULL INSTRUCTIONS AT THE END OF THIS FILE) Each "40H-EPD" command-line tool consists of a single file. Each tool executes from a command-line in a "Command Prompt" window. The "40H-EPD" tools do not change any input file. Output appears in a new file(s). There are 35 tools and 7 batch scripts. Each batch script is paired with one of the tools. 1. "bmOpcode" is used in combination with the chess GUI "Arena 3.5.1" by Martin Blume, and a computer engine for analysis. Together they output "best move" ("bm"), "centipawn evaluation" ("ce"), and "predicted variation" ("pv") opcodes. 2. "epd3fold" / "pgn3fold" is used in combination with the utility tool "PGN-Extract" by David Barnes and the "40H-PGN" tool "numExtract". Together they input a "pgn" file and output those games having a 3-fold position repetition. 3. "epdColor" separates the records based on whether White or Black is the active color (the color that makes the next move). 4. "epdConvert" produces a "pgn" file from an "epd" file. Each "epd" record is put into a "pgn" game shell having a "FEN" tag. 5. "epdDifference" extracts records based on a user-specified simple material difference range between the pieces of the two colors. 
The user can use the default piece values, or specify custom values. The simple material difference of each record is listed in a separate file. 6. "epdEnPass" removes faux "en passant" target squares from an "epd" file. An "en passant" target square is faux when the side to move does NOT have a pawn in position to capture the target square. 7. "epdExtra" separates the records of the input "epd" file based on the number of "extra" promoted pieces. An extra promoted piece is a 2nd or more Queen, a 3rd or more Rook, Bishop, or Knight. 8. "epdFin" / "pgnFin" is used in combination with the utility tool "PGN-Extract" by David Barnes. Together they input a "pgn" file. Then for each game, they output the "epd" record of the final position. 9. "epdFlip" reverses the colors and does a vertical reflection about the imaginary horizontal line between ranks "4" and "5". The color to move, "castling" rights and "en passant" rights are also reversed. The new "epd" record is logically equivalent to the old "epd" record. 10. "epdImbalance" separates the records into two files. One for records where the opposing sides have different material (imbalanced), and the other for records where the opposing sides have the same material (balanced). 11. "epdInsert" appends new opcodes to the records. 12. "epdInsuff / pgnInsuff" is used in combination with the utility tool "PGN-Extract" by David Barnes. Together they extract drawn games that end with insufficient checkmating material. 13. "epdKings" extracts records where the Kings are advanced beyond each other. In other words, a record is extracted if the White King's rank (row) is higher than the Black King's rank (row). 14. "epdMask" extracts records based on a minimal user-specified "pieces/location structure" such as "2kr4/8/8/8/8/8/8/2KR4". 15. "epdMaterial" extracts records based on a user-specified file "pieces" that lists specific pieces and their specific quantities. 16. 
"epdOccur" lists the number of occurrences of each distinct position in the input "epd" file. It also lists the line numbers where the positions occurred. 17. "epdOrder" sorts the records in descending order based on the "centipawn evaluation" ("ce" opcode) of the position from White's perspective. 18. "epdPawnDifference" separates records based on the difference in the number of pawns of the two sides. 19. "epdPieces" extracts records based on a user-specified number range for the number of chess pieces on the board. 20. "epdPly" / "pgnPly" is used in combination with the utility tool "PGN-Extract" by David Barnes. Together they input a "pgn" file. Then for each game, they output the "epd" record of the position after a user-specified number of plies. 21. "epdPosition" inputs a user-specified "epd" search file and an "epd" input data file, then outputs records in the input data file that match the first and second fields of a position in the search file. 22. "epdRemove" removes a user-specified opcode from the "epd" input file and saves the removed opcode and its values in a second output file. 23. "epdSingle" removes records containing the same position as a previous record in the "epd" file. The removed records are saved. 30. "epdSort" sorts the records of an "epd" file aphanumerically based on the first four fields of each record. It sorts in ascending or descending order. 24. "epdTrim" removes all opcodes from the "epd" input file, and saves the removed opcodes two different ways - horizontally and vertically. 25. "epdTriplePawns" extracts records where there are 3 or more pawns of the same color in the same column. 26. "idOpcode" outputs a file of "id" opcodes, each containing an ID number. 27. "smOpcode" outputs supplied move ("sm") opcodes from a "pgn" file. Each "sm" opcode corresponds to an actual move in the "pgn" file. 28. "txtColumn" outputs a contiguous range of columns in a text file. It uses user-specified starting and ending column numbers. 29. 
"txtMerge" joins 2, 3, 4 or 5 "txt" files by adding one line from each successive input file, then repeating. 30. "txtOccur" lists the number of occurrences of each distinct non-blank line. It also lists the line numbers where the lines occurred. 31. "txtSingle" removes any line that is a duplicate of a prior line. The remaining lines are in their original order. 32. "txtSort" sorts the lines of a text file alphanumerically. It sorts in either ascending or descending order. 33. "txtSplit" splits a large text file into as many as five separate text files. Splitting can be repeated. 34. "txtZero" aligns a column of integers in a text file by inserting leading spaces or leading zeros. ==================================================================== "epd" AND "txt" PREFIXES The "epd" prefix indicates that the tool should only be used on "epd" files. The "txt" prefix indicates that the tool can be used on all text files, including "epd" files. ==================================================================== BATCH/SCRIPT PGN TOOLS USing A 40H-EPD TOOL pgn3fold.cmd: extracts "pgn" games containing a 3-fold repetition of a position. Uses "pgn-Extract", "epd3fold" and "numExtract". pgnFin.cmd: lists the final position of each "pgn" game in "epd" format. Uses "pgn-Extract" and "epdFin". pgnInsuff.cmd: extracts games ending in a position with insufficient material for a checkmate. Uses "pgn-Extract", "epdInsuff" and "numExtract". pgnMask.cmd: extracts "pgn" games containing a position with a user-specified "position/location" structure in "epd" format. Uses "pgn-Extract", "epdMask" and "numExtract". pgnMaterial.cmd: extracts "pgn" games containing a position with piece/ quantity values from the user-specified file "pieces". Uses "pgn-Extract", "epdMaterial" and "numExtract". pgnPly.cmd: lists the position in each "pgn" game that occurred after a user-specified ply. Outputs in "epd" format. Uses "pgn-Extract" and "epdPly". 
pgnPosition.cmd: extracts "pgn" games containing a position in a user-specified file of "epd" positions. Uses "PGN-Extract", "epdPosition" and "numExtract".

Full instructions are available in the instructions for the related "40H-EPD" tool.

The 7 batch script "pgn" tools are included in the "40H-EPD" download. All files executed within the batch script must either be in the Working Folder or on the System Path. An input data file must be in the Working Folder or be specified with a pathname.

====================================================================
INTRODUCTION:

Download the "40H-EPD" compressed file. It is packed in "7-zip" format. It can be unpacked using "7-zip", available at: http://www.7-zip.org

Unpacking the "40H-EPD" download file results in 35 "Windows" executable ("exe") files, 7 batch script ("cmd") files, and 1 readme file.

The "readme-EPD" file is oriented to users of the "Windows" executable files. Users of the "Java" class files have to make adjustments for use in a Java Environment.

Each "40H-EPD" tool is a "command-line tool", which means that it executes on a command-line in a "Command Prompt" window.

The "40H-EPD" tools were written in "Java" and compiled using "gcj 3.4". All coding is original.

Each "40H-EPD" tool consists of a single self-contained file. No external "dlls" are required. Each "40H-EPD" tool is portable and just has to be copied to be installed. It does not need a setup program and it does not write any data to the registry.

Unless otherwise indicated, each "40H-EPD" tool inputs an "epd" file.

"40H-EPD" tools DO NOT MAKE ANY CHANGES to the input files. Output appears in a new file(s). Output files are pre-named. Users should rename the output file(s) before they are overwritten by the next execution of the tool.

"40H-EPD" tools assume the records in the "epd" input files are written in Standard Algebraic Notation ("SAN") and adhere to the "EPD Standards" in the "Standard PGN Specification Guide".
"EPD Standards" are in the "Standard PGN Specification Guide" at: http://www.saremba.de/chessgml/standards/pgn/pgn-complete.htm and also at http://jchecs.free.fr/pdf/EPDSpecification.pdf The 35 "40H-EPD" tools are: "bmOpcode", "epd3fold", "epdColor", "epdConvert", "epdDifference", "epdEnPass", "epdExtra", "epdFin", "epdFlip", "epdImbalance", "epdInsert", "epdInsuff", "epdKings", "epdMask", "epdMaterial", "epdOccur", "epdOrder", "epdPawnDifference", "epdPieces", "epdPly", "epdPosition", "epdRemove", "epdSingle", "epdSort", "epdTrim", "epdTriplePawns", "idOpcode", "smOpcode", "txtColumn", "txtMerge", "txtOccur", "txtSingle", "txtSort", "txtSplit", and "txtZero". The 7 batch/script "pgn" tools are: "pgn3fold", "pgnFin", "pgnInsuff", "pgnMask", "pgnMaterial", "pgnPly" and "pgnPosition". Thanks to Jim Ablett for helping me to get started on this project and for showing me how to compile "Windows" executables. ==================================================================== LIMITATIONS: "40H-EPD" tools ONLY execute from a command-line within a Command Prompt. See http://dosprompt.info/ for Command Prompt support. "40H-EPD" tools are subject to the capacities of the maximum array sizes that are specified in their coding. These maximum array sizes are adequate for most "epd" input files. However, if you use a very large "epd" input file, there is a possibility that an overflow error will occur or that execution will take an extended amount of time. "40H-EPD" tools might not perform properly when processing an input "epd" file that is extremely fragmented. Be sure to defragment your hard drive regularly. "40H-EPD" tools might not perform properly when processing an input "epd" file that does NOT completely conform to "EPD" Standards". Some "40H-EPD" tools may take a very long time to process very large "epd" files. You could check the "Task Manager" if you think the tool has stalled. 
Newer versions of external tools mentioned in this document may or may not work properly with "40H" tools. In addition, the availability to download those tools may change. For example, a web site may change to a new address, or even discontinue. Any such event could affect "40H" tools. It is therefore suggested that you continue to keep a copy of any (old) version of an external tool that works properly with "40H" tools. ==================================================================== INSTALLATION, FILES, AND FOLDERS: Create a folder named "40H" if it does not already exist. Extract the download into the "40H" folder. The extraction will unpack the 35 tools into a subfolder named "40H-EPD-2024A". The tools can then be copied/moved to any other folder, preferably one that is already on the System Path. For users who only will be using the "40H-EPD" tools occasionally, the simplest arrangement is to copy the desired tool to the folder where the input file(s) are located and then run in that folder. Likewise, you could copy the input file(s) to the folder where the "40H-EPD" tools are located and then run in that folder. For users who will be using "40H-EPD" tools often, copy/move the "40H-EPD" tools to a folder that is already on the System Path. (Type "path" in a command window to see the System Path.) This way you can run any of the "40H-EPD" tools from any folder. If you do not have such a folder on your System Path, you would first have to create the folder and then edit the Path Variable. Use "search" in "Settings" to find "System Environment Variables". Then click on "Environment Variables", then "Path" in "System Variables", and then "edit". This may vary depending on "Windows" version. Input files are not changed. For extra safety, the user should save all input "pgn" files on another storage medium. 
Running a "40H-EPD" tool without mentioning all required input files and parameters will list the version number of the tool, the syntax, an example of usage, and the names of the output files. The tools and the input files have to interact. The following 4 items summarize possible arrangements: (1) If the tool and input file(s) are in the same folder, and you are working in that folder, you are good. (2) If the tool is in a folder on the System Path, and you are working in the folder containing the input file(s), you are good. (3) It is best to work in the folder containing the input file(s). This folder is referred to as the "Working Folder". It is also the folder where output files will be created. (4) A pathname is needed for an input file that is NOT in the Working Folder. Output appears in a new file(s). An output file cannot be used as input to the "40H" tool that produced it, unless its filename is changed. Output files are TEMPORARY files. They are created in the Working Folder. Be sure to rename/copy/move any output files that you want to keep. The next execution of the tool in that folder will overwrite the previous output file(s). Do not change an original output file to "read-only" as that will prevent the creating tool from executing in that folder. Many "40H-EPD" output filenames are in the form "out*.epd". Many times the records NOT extracted to "out*.epd" are output to files whose filenames are in the form "exclude*.epd". Sometimes the user is more interested in "exclude*.epd" than in "out*.epd". "40H-EPD" tools accept "UTF-8" encoded "epd" files. However, all "40H-EPD" output files are "Latin-1" encoded. "Latin-1" is the "PGN Standard". ==================================================================== EXECUTION: Each tool executes from a command-line in a "Command Prompt" window. The general format for running a tool is: tool_name [data_filename] epd_filename [parameter(s)] Examples: 1. epdImbalance alpha.epd 2. 
epdInsert inlist alpha.epd (uses data file)
3. idOpcode alpha.epd Openings (uses parameter)

After entering the proper command-line, and making sure all files are accessible, press the "Enter" key to start execution. Some tools, like "epd3fold" and "epdOccur", take more time to process large files compared to other "40H-EPD" tools.

Output is in a new file(s) in the Working Folder. The original input file is not changed.

Be sure to follow the specific instructions for each tool.

====================================================================
EXTERNAL TOOLS:

Future updates of external tools might or might not work properly with "40H-EPD" tools. It is recommended to always retain a working version.

Users need a "PGN Viewer" to see a game or a position. The "PGN Viewer" could also be a GUI (graphical user interface).

"PGN-Extract" by David Barnes is a PGN/EPD/FEN utility tool that performs many functions. "PGN-Extract" is free to download. The current download site is: http://www.cs.kent.ac.uk/people/staff/djb/extract.html

"Arena" (version 3.5.1) by Martin Blume is a UCI/Winboard Graphical User Interface. "Arena" is free to download. Its current download site is: http://www.playwitharena.com/

====================================================================
BATCH SCRIPTS ("cmd" or "bat" files)

Using batch scripts can greatly increase the convenience of the "40H-EPD" suite. They can be used to string together several tools. All files mentioned within the batch script should be in the Working Folder or on the Path.
Assuming the following is saved as "pgn3fold.cmd":

    PGN-Extract -Wepd --nofauxep -s -otemp.epd %1
    epd3fold temp.epd
    numExtract numbers %1

Usage: pgn3fold alpha.pgn
Output: manifest-3f, outZ.pgn

====================================================================
OPCODES

The following tools add or remove opcodes from all records in an "epd" file:
epdInsert: adds a single opcode
epdRemove: removes a single opcode
epdTrim: removes all opcodes

====================================================================
====================================================================
FULL USAGE INSTRUCTIONS:

=========================(1) bmOpcode ==============================

"bmOpcode" is used in combination with the chess GUI "Arena 3.5.1" by Martin Blume, and a computer engine for analysis. Together they output "best move" ("bm"), "centipawn evaluation" ("ce"), and "predicted variation" ("pv") opcodes.

By using "epdInsert", and the output file "bmlist", "bm" and "ce" opcodes can be appended to the "epd" input file. By using "epdInsert", and the output file "pvlist", "pv" opcodes can be appended to the "epd" input file.

"bmlist" and "pvlist" do not contain blank lines. If the "epd" input file contains blank line(s), the user will have to use a text editor to insert blank line(s) into "bmlist" and "pvlist" to coordinate with the "epd" file.

In "Arena 3.5.1", use "Automatic Analysis" under "Engines". The engine you are going to use has to be "loaded", and "configured" if necessary. Only use one computer engine.

------
Configuration for "Arena 3.5.1":

Within "Engines/Automatic Analysis/Source" set "Direction" to "forward".

Within "Engines/Automatic Analysis/Engines" choose your engine and set the "Level".

Within "Engines/Automatic Analysis/Output":
a) Check that the output file name is "Analyses.log". Previous versions of "Arena" and "bmOpcode" used a different filename. "bmOpcode" looks for a file named "Analyses.log".
b) Set "Protocol file" to "Overwrite".
Within "Engines/Automatic Analysis/Options" set Analysis Lines Minimum search depth to "1".

Within "Options/Appearance/OtherSettings/Chess": set "Values always from white's point of view." This is to conform to "PGN Standards".
-----

"bmOpcode" will input "Analyses.log" and create output files "bmlist" and "pvlist". "bmlist" contains "bm" and "ce" opcodes. "pvlist" contains "pv" opcodes. Line numbers in "bmlist" and "pvlist" match the corresponding record line numbers in the "epd" input file.

"bmOpcode" only outputs one "bm" value per position. Additional "bm" values can be added by using a text editor.

"bmOpcode" is dependent upon the engine producing a large amount of descriptive output. Since engines differ, "bmOpcode" performs better with some engines than with others.

A value of "..." for a "bm", "ce" or "pv" opcode indicates that "bmOpcode" was not able to process that record. Sometimes running "bmOpcode" again will be successful. But there will be times when "bmOpcode" cannot succeed because of unusual output in "Analyses.log".

To append the "bm" and "ce" opcodes in "bmlist", or the "pv" opcodes in "pvlist" to the "epd" input file:
1. << Arena outputs Analyses.log from beta.epd >>
2. bmOpcode Analyses.log
3. << User checks bmlist against beta.epd >>
4. copy bmlist inlist OR copy pvlist inlist
5. epdInsert inlist beta.epd
6. << Output is outN.epd >>

"Analyses.log" must be located in the Working Folder and cannot be referenced using a pathname.

"bmOpcode" ONLY outputs one "bm" value, although occasionally Arena finds more than 1 "bm" (best move) for a position. To locate such additional "bm" moves, search "Analyses.log" for the word "solutions" (with the "s" at the end). Then using a text editor, add the additional "bm" moves into "bmlist". Do not insert any commas or other punctuation.

For example: Suppose the original bmlist had the line:
bm g2-g3; ce +M2;
and suppose there is another equally good "bm" (best move): Rg6-g3.
Then you would manually insert it into bmlist with a text editor:
bm g2-g3 Rg6-g3; ce +M2;

Syntax: bmOpcode Analyses.log
Example: bmOpcode Analyses.log
Output: bmlist, pvlist

Comments:
1. A "ce" opcode gives a relative evaluation of a position, from the White point of view, based on "centipawns" which are 1/100 of a pawn. This is different, by a factor of 100, from the common evaluation based on one pawn. For example, a common evaluation of "+1.35" pawns is equivalent to a "ce" evaluation of "135".
2. A "pv" opcode gives a variation of what might happen with best play by both colors, starting with the "best move". It is the variation that the computer engine believes to be the best series of moves for each side, at the time the engine stopped analyzing.
3. "bm", "ce" and "pv" values usually change as a computer engine continues to analyze a position.
4. Older versions of Arena (versions 1 and 2) used the spelling "Analysis.log" for the output log file. You would have to change the name to "Analyses.log" for use with "bmOpcode".

=====================(2) epd3fold / pgn3fold =======================

"epd3fold" is used in combination with the utility tool "PGN-Extract" by David Barnes and the "40H-PGN" tool "numExtract". Together they input a "pgn" file and output those games having a 3-fold position repetition.

A position is "repeated" if all pieces of the same kind and color are on identical squares, and all possible moves are the same. For computer processing, "epd3fold" interprets this to mean that the first four tokens in their respective "epd" records are identical.

Although the two interpretations differ slightly, the possible difference in output is negligible. For example, if one position has a "castling" (or "en passant") permission that is temporarily not executable, it will have the same possible moves as a similar position that only differs by not having that "castling" (or "en passant") permission.
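The "first four tokens" comparison described above can be sketched in a few lines of Python. This is only an illustration, not the code of "epd3fold": the function names are hypothetical, and the sketch assumes the list it receives holds the "epd" records of a single game, one record per line.

```python
from collections import defaultdict

def position_key(epd_record):
    """Return the first four EPD fields: placement, active color, castling, en passant."""
    return " ".join(epd_record.split()[:4])

def find_repetitions(records, threshold=3):
    """Map each position seen at least `threshold` times to its 1-based line numbers.

    Assumption: `records` holds the EPD records of ONE game, in move order.
    Blank lines are skipped but still counted for line numbering.
    """
    seen = defaultdict(list)
    for line_no, record in enumerate(records, start=1):
        if record.strip():
            seen[position_key(record)].append(line_no)
    return {key: lines for key, lines in seen.items() if len(lines) >= threshold}
```

Because only the first four fields form the key, opcodes such as "id" or "ce" do not affect the comparison, but a leftover "faux" "en passant" target square does.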
A 3-fold position repetition is very hard for humans to notice if the positions do not occur in successive moves.

When using "epd3fold" you must precede it with either "PGN-Extract" using its "--nofauxep" parameter or "epdEnPass". This eliminates "faux" "en passant" notations. A "faux" "en passant" notation could cause "epd3fold" to incorrectly treat two positions having the same possible moves as different. (See example below)

"PGN-Extract", with the "-Wepd" and "--nofauxep" parameters, precedes "epd3fold" in the sequence of execution. It produces an "epd" file from a "pgn" file and also removes "faux" "en passant" notations.

"epd3fold" then produces the game numbers of games having a 3-fold repetition in the "epd" file from the previous step. Those numbers are listed in the output file "numbers". Also, repeated positions, the game numbers and numbers of occurrences are listed in the output file "manifest-3f".

"numExtract" follows "epd3fold" in the sequence of execution. It uses the file "numbers" to produce the output file "outZ.pgn". "outZ.pgn" contains the "pgn" games having a 3-fold repetition. No game is listed more than once even if it has more than one 3-fold position repetition.

To demonstrate what could happen if "faux" "en passant" notations are NOT removed before using "epd3fold", consider the following game fragment:

1. e4 e5 2. Nf3 Nc6 3. Ng1 Nb8

The positions after move #1 and move #3 are identical. However, after move #1, the "epd" has a "faux" target square "e6".

rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq e6
rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq -

This difference would cause "epd3fold" to INCORRECTLY evaluate the two positions as NOT being the same. But the fact is that all their pieces are of the same kind and color, are on identical squares, and all possible moves are the same.

Syntax:
    PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn
    epd3fold temp.epd
    numExtract numbers filename.pgn
Output: manifest-3f, outZ.pgn

Comments:
1. 
Because "epd3fold" with "PGN-Extract" requires three tools to be executed successively, a "Windows" batch script ("cmd" file) may be used for convenience. You can copy and paste the following code into a text editor, and save it with the name "pgn3fold.cmd":

    PGN-Extract -Wepd --nofauxep -s -otemp.epd %1
    epd3fold temp.epd
    numExtract numbers %1

Syntax: pgn3fold filename.pgn
Example: pgn3fold beta.pgn
2. "pgn3fold.cmd" is included in the "40H-EPD" download.
3. Using the tool "gameNum" on the input "pgn" file before using "pgn3fold" is recommended.
4. All files mentioned within the batch script must be in the Working Folder or on the Path.

=========================(3) epdColor ==============================

"epdColor" separates the records based on whether White or Black is the active color (the color that makes the next move). "outW.epd" contains the records having White move next and "outB.epd" contains the records having Black move next.

Syntax: epdColor filename.epd
Example: epdColor alpha.epd
Output: outW.epd, outB.epd

Comments:
1. "epdColor" is often used before using other tools in order to keep results separated by active color.

=========================(4) epdConvert ============================

"epdConvert" produces a "pgn" file from an "epd" file. Each "epd" record is put into a "pgn" game shell having a "FEN" tag. Since the conversion is from an "epd" file, not a "fen" file, the values for the halfmove clock and the fullmove counter are not available. Therefore the default values of "0" and "1" are used. The output file is "outEPD.pgn".

If a best move ("bm") or predicted variation ("pv") opcode is present, the "bm" move or the "pv" moves will be output. If both are present, only the "pv" moves will be output. Usually the first move of the "pv" is the "bm" move. If more than one "bm" value is present, only the first one will be processed. If an "id" opcode is present, it is output as an "ID" tag. Other opcode data is NOT carried forward to the "pgn" file.
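As a rough illustration of the conversion described above, the following Python sketch wraps one "epd" record in a minimal "pgn" game shell. It is not the code of "epdConvert": the function name is hypothetical, the exact set of tags written by the real tool is an assumption, and the sketch ignores "bm", "pv" and "id" opcodes entirely. The "SetUp" tag is included because the "PGN Standard" requires it alongside a "FEN" tag.

```python
def epd_to_pgn_shell(epd_record):
    """Build a minimal PGN game shell from one EPD record (opcodes are ignored)."""
    fields = epd_record.split()
    if len(fields) < 4:
        raise ValueError("an EPD record needs at least four fields")
    # EPD has no halfmove clock or fullmove counter, so use the defaults 0 and 1.
    fen = " ".join(fields[:4]) + " 0 1"
    return "\n".join([
        '[SetUp "1"]',          # tells PGN readers to start from the FEN position
        f'[FEN "{fen}"]',
        '[Result "*"]',         # result is undetermined
        "",
        "*",
        "",
    ])

print(epd_to_pgn_shell("8/8/8/8/8/5k2/8/5K1R w - - bm Rh3+;"))
```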
Each game in the output is presumed to have an undetermined result ("*"). If you suspect that a checkmate or stalemate is possible due to the "pv" or "bm" data, you can use "PGN-Extract" with the "--fixresulttags" option to correct the results. For example:

    pgn-extract --fixresulttags -s -oout.pgn outEPD.pgn

The file "out.pgn" will have the correct results for games ending in a checkmate or stalemate. "PGN-Extract" also checks the legality of the moves and converts LAN to SAN.

The formatting in "outEPD.pgn" can be improved using "trim" from "40H-PGN".

Syntax: epdConvert filename.epd
Example: epdConvert alpha.epd
Output: outEPD.pgn

Comments:
1. Users who do NOT want to have the "bm" or "pv" moves output should use "epdRemove" or "epdTrim" before using "epdConvert".
2. A "pgn" with a "FEN" tag is not useful if you are building an opening book.

=========================(5) epdDifference =========================

"epdDifference" extracts records based on a user-specified simple material difference range between the pieces of the two colors. The user can use the default piece values, or specify custom values. The simple material difference of each record is listed in a separate file.

"Simple material difference" is defined here as the total of the piece values of the active color minus the total of the piece values of the non-active color. The default piece values are Queen = 9, Rook = 5, Bishop = 3, Knight = 3 and Pawn = 1. The user can specify other values.

"epdDifference" uses the perspective of the active color (the color making the next move). So, if the simple material difference is 3, it means that the total simple piece value of the active color exceeds that of the non-active color by 3 Pawn units. Likewise, a simple material difference of -3 Pawn units means that the total simple piece value of the non-active color exceeds that of the active color by 3 Pawn units.

The default simple material piece values are: Queen = 9 Pawn units,
Rook = 5 Pawn units, Bishop = 3 Pawn units, Knight = 3 Pawn units, Pawn = 1 Pawn unit.

The King is not given a piece value because each color always has exactly one King, and that balances out. In addition, any piece value for the King would have to be infinite because of the King's infinite importance.

The user has the option to set custom simple material piece values by specifying those values on the command line, after specifying the range. The user must specify the values in the order of Queen, Rook, Bishop and Knight. All 4 values must be specified even if a value is the default value. The user does not have to specify a piece value for a Pawn because the Pawn is the unit of comparison.

For example, if the user wants the Queen to be worth 10 Pawn units, the Bishop to be worth 3.5 Pawn units, and the Rook and Knight to retain their default values, the user would state the following:

10 5 3.5 3

Notice that there is a space between the values but there is no punctuation.

The simple material difference is then calculated by taking the total simple piece value of the active color minus the total simple piece value of the non-active color.

For example, 3r2q1/5pk1/6p1/6P1/7Q/8/6K1/7R w - - has the following total simple piece values when the Queen is valued at 10 Pawn units, as in the custom values above:
White piece values: 1Q + 1R + 1P = 16
Black piece values: 1q + 1r + 2p = 17
(With the default values, the totals are 15 and 16.) Since White is the active color, the simple material difference = 16 - 17 = -1.

The simple material difference is a very rough estimator of which color has the advantage in the game. It does NOT take into account the many nuances of each position such as threats, weaknesses, mobility, king protection, and impending captures.

To run "epdDifference", the user has to specify a RANGE for the simple material difference. The minimum value is followed by the maximum value, but the minimum value can equal the maximum value. These values are inclusive.
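The calculation described above can be sketched in Python as follows. This is an illustration only, not the code of "epdDifference"; the function names and the dictionary of piece values are hypothetical. With the default values the example position totals 15 for White and 16 for Black (16 and 17 if the Queen is valued at 10), so the simple material difference is -1 either way.

```python
# Default piece values in Pawn units; the King is deliberately left out.
DEFAULT_VALUES = {"q": 9, "r": 5, "b": 3, "n": 3, "p": 1}

def simple_material_difference(epd_record, values=None):
    """Active color's total piece value minus the non-active color's total."""
    values = values or DEFAULT_VALUES
    placement, active = epd_record.split()[:2]
    # Uppercase letters are White pieces, lowercase letters are Black pieces.
    white = sum(values.get(ch.lower(), 0) for ch in placement if ch.isupper())
    black = sum(values.get(ch, 0) for ch in placement if ch.islower())
    return white - black if active == "w" else black - white

def in_range(epd_record, min_diff, max_diff, values=None):
    """True if the simple material difference lies in the inclusive range."""
    return min_diff <= simple_material_difference(epd_record, values) <= max_diff

record = "3r2q1/5pk1/6p1/6P1/7Q/8/6K1/7R w - -"
print(simple_material_difference(record))   # prints -1
print(in_range(record, -3, 1))              # prints True
```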
For example:

    epdDifference alpha.epd 2 5

will extract records from alpha.epd which have a simple material difference from 2 to 5 Pawn units, inclusive.

An example where the user chooses the piece values:

    epdDifference alpha.epd 2 5 9.5 5 3.1 3

will extract records over the same range as in the previous example, except that new piece values are specified on the command line in the following order: Queen = 9.5 Pawn units, Rook = 5 Pawn units, Bishop = 3.1 Pawn units, Knight = 3 Pawn units. The values of all four piece types must be listed if you want to change the piece values, and the order must be Queen, Rook, Bishop, and Knight.

Other usage examples:

    epdDifference alpha.epd 0 0

will extract records where the simple material difference is 0. Note that this does not mean that the two colors have the same pieces.

    epdDifference alpha.epd 2 2

The extracted records will have a simple material difference of exactly 2 Pawn units. The total simple piece value of the active color exceeds that of the non-active color by exactly 2 Pawn units.

    epdDifference alpha.epd 2 6

The extracted records will have a simple material difference ranging from 2 Pawn units to 6 Pawn units inclusive.

    epdDifference alpha.epd -3 1

The extracted records will have a simple material difference ranging from -3 Pawn units to 1 Pawn unit inclusive.

    epdDifference alpha.epd -5 -2

The extracted records will have a simple material difference ranging from -5 Pawn units to -2 Pawn units inclusive.

The output file "outM.epd" contains the records that are in the range. The output file "outSMD.epd" contains a list of all the records along with a new opcode "smd" that specifies the simple material difference for that record. The records are listed in the original input order.
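The calculation described in this section can be illustrated with a short Python sketch. This is NOT the code of "epdDifference"; the function names "simple_material_difference" and "in_range" are hypothetical and chosen only for illustration. It assumes a well-formed "epd" record whose first two fields are the piece placement and the active color.

```python
# Illustrative sketch only; not the actual "epdDifference" implementation.
# Default piece values in Pawn units; the King has no value.
DEFAULT_VALUES = {"q": 9.0, "r": 5.0, "b": 3.0, "n": 3.0, "p": 1.0}


def simple_material_difference(epd, values=DEFAULT_VALUES):
    """Active color's total piece value minus the non-active color's total."""
    fields = epd.split()
    placement, active = fields[0], fields[1]
    # Uppercase letters are White pieces, lowercase letters are Black pieces.
    white = sum(values.get(ch.lower(), 0.0) for ch in placement if ch.isupper())
    black = sum(values.get(ch, 0.0) for ch in placement if ch.islower())
    return white - black if active == "w" else black - white


def in_range(epd, min_diff, max_diff, values=DEFAULT_VALUES):
    """True when the difference lies in the inclusive range."""
    return min_diff <= simple_material_difference(epd, values) <= max_diff


record = "3r2q1/5pk1/6p1/6P1/7Q/8/6K1/7R w - -"
print(simple_material_difference(record))  # -1.0
print(in_range(record, -3, 1))             # True
```

With the worked position from this section, the sketch gives a difference of -1 from White's (the active color's) perspective, so the record would fall inside a range such as "-3 1" but outside "2 5".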
Syntax: epdDifference filename.epd min_diff max_diff [Qval Rval Bval Nval]
Examples: epdDifference alpha.epd 2 4
          epdDifference alpha.epd -3 5
          epdDifference alpha.epd 2.1 6 9.5 5 3.3 3.2
Output: outM.epd, outSMD.epd, excludeM.epd

Comments:
1. There are many opinions by chess experts concerning the best relative piece valuations. See "Wikipedia" on relative values of chess pieces for further information.
2. The user-specified numbers for the range and the custom material numbers can be decimal numbers. For example:

    epdDifference alpha.epd 2.3 4.4 9.6 5.2 3.1 2.8

where 2.3 and 4.4 are the range numbers, and 9.6, 5.2, 3.1 and 2.8 are the custom material numbers for a Queen, a Rook, a Bishop and a Knight.

=========================(6) epdEnPass =============================

"epdEnPass" removes faux "en passant" target square notations from an "epd" file.

An "en passant" target square is created immediately after a pawn moves two ranks from its starting position. The target square is the square where the pawn would be if it had only moved one rank. An "en passant" target square notation gives the opponent permission, ON HIS NEXT MOVE ONLY, to capture the moved pawn at the target square with one of his pawns. The "en passant" capture move MUST be executed on the first move after an "en passant" target square is created. If it is not used then, the permission expires.

For example, when White makes the opening move of "e4", the "en passant" target square notation "e3" is created in the "epd" record. However, the opponent does not have a pawn in position to capture at square "e3". "e3" is a "faux" target square. "epdEnPass" will remove it and replace it with "-".

Although "epd" notation rules specify there be a target square when a pawn initially moves 2 ranks, these rules do not specify that there MUST be an opponent pawn in position to make a capture at the target square. Most of the time there is NO pawn in position to capture at the target square.
If that is the case, the capture is not possible and the target square is spurious or "faux". The writers of "epd" notation did not see any harm in having a "faux" target square. However, with the onset of computer analysis, the "epd" position with a "faux" target square and an "epd" of an otherwise identical position without a target square, are considered to be two different positions and that causes a "matching" problem. That is why it is important to use "epdEnPass" to remove faux target squares caused by there not being a pawn in position to capture. In a DIFFERENT "en passant" scenario, a capture cannot occur if the "en passant" capture would put the capturer's King into check. This could happen when the capturing pawn is pinned to its King. Since the capture would be an "illegal" move, it cannot occur and the target square is "faux". Although the faux target square should be removed, "epdEnPass" DOES NOT REMOVE IT. It should also be noted that this different scenario is extremely rare, while the scenario where there is no pawn in position to capture occurs very often. The output file "outEP.epd" contains the corrected records with "faux" target squares removed, in addition to the other records. The output file "manifest-ep" contains a list of records that were corrected along with their line numbers. Syntax: epdEnPass filename.epd Examples: epdEnPass alpha.epd Output: outEP.epd, manifest-ep Comments: 1. An "en passant" target square capture can only occur on the next move by the opponent after the target square has been created. The target square expires after the opponent's next move. 2. It is possible for two opponent pawns to be in position to make a capture at the target square. =========================(7) epdExtra ============================== "epdExtra" separates the records of the input "epd" file based on the number of "extra" promoted pieces. An extra promoted piece is a 2nd or more Queen, a 3rd or more Rook, Bishop, or Knight. 
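The definition of an "extra" promoted piece can be illustrated with a short Python sketch. This is NOT the code of "epdExtra"; the function name "extra_promoted" is hypothetical. It assumes a well-formed "epd" record and counts, for each side, every Queen beyond the first and every Rook, Bishop or Knight beyond the second.

```python
# Illustrative sketch only; not the actual "epdExtra" implementation.
# Number of each piece type a side has at the start of a game.
ORIGINAL_COUNTS = {"q": 1, "r": 2, "b": 2, "n": 2}


def extra_promoted(epd):
    """Return (white_extras, black_extras) for an "epd" record."""
    # Only the piece-placement field is scanned; other fields also
    # contain letters such as "w", "b" and "KQkq".
    placement = epd.split()[0]
    white = sum(max(0, placement.count(piece.upper()) - start)
                for piece, start in ORIGINAL_COUNTS.items())
    black = sum(max(0, placement.count(piece) - start)
                for piece, start in ORIGINAL_COUNTS.items())
    return white, black


print(extra_promoted("QQ2k3/8/8/8/8/8/8/4K3 w - -"))  # (1, 0)
```

The larger of the two numbers is what matters for the classification described in this section: 0, exactly 1, or 2 or more.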
"epdExtra" separates the records of an "epd" file into 3 files: (a) "outX0.epd" : neither side has an "extra" promoted piece. (b) "outX1.epd" : one or both sides has exactly one "extra" promoted piece and neither side has more than one "extra" promoted piece. (c) "outX2.epd" : one side or both sides have 2 or more "extra" promoted pieces. When a Pawn is promoted, it does not always result in an "extra" promoted piece. For example, if one side has previously lost its Queen, a new Queen by promotion is not "extra" because that side now has just 1 Queen. If another Pawn is promoted to a Queen, that side will have 2 Queens and therefore one "extra" promoted piece. Examples: 1. White and Black do not have any "extra" promoted pieces. The record belongs in "outX0.epd". Zero "extra" promoted pieces. 2. White has 2 Queens. The record belongs in "outX1.epd". White has 1 "extra" promoted piece. 3. White has 2 Queens and Black has 2 Queens. This record belongs in "outX1.epd". White and Black EACH have 1 "extra" promoted piece. 4. White has 2 Queens and Black has 3 Rooks. The record belongs in "outX1.epd". White and Black EACH have 1 "extra" promoted piece. 5. White has 3 Queens and Black has 2 Queens. The record belongs in "outX2.epd". White has 2 "extra" promoted pieces in total. 6. Black has 4 Rooks. The record belongs in "outX2.epd". Black has 2 "extra" promoted pieces in total. 7. White has 3 Knights and 3 Bishops. The record belongs in "outX2.epd". White has 2 "extra" promoted pieces in total. Syntax: epdExtra filename.epd Usage: epdExtra alpha.epd Output: outX0.epd, outX1.epd, outX2.epd =========================(8) epdFin / pgnFin ======================= "epdFin" is used in combination with the utility tool "PGN-Extract" by David Barnes. Together they input a "pgn" file. Then for each game, they output the "epd" record of the final position. Optionally, "epdFin" can output a user-specified number of records from the end of each game. 
"PGN-Extract" outputs an intermediate "epd" file, "temp.epd", which contains all the "epd" records (positions) encountered in the input "pgn" file. A blank line separates successive games. Then "epdFin" inputs "temp.epd" and outputs the last record of each game to "outF.epd". If the user specifies the optional "record_number", then that number of records will be output from the end of each game. If a game has fewer records than the user-specified "record_number", then all of its records will be output. Example: PGN-Extract -Wepd --nofauxep -s -otemp.epd alpha.pgn epdFin temp.epd 3 will output the final 3 "epd" records from each game. If 2 or more records from the end of each game are requested, then the output file inserts a blank line between each output set. Syntax: PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn epdFin temp.epd [record_number] Example: PGN-Extract -Wepd --nofauxep -s -otemp.epd alpha.pgn epdFin temp.epd Example: PGN-Extract -Wepd --nofauxep -s -otemp.epd alpha.pgn epdFin temp.epd 20 Output: outF.epd When using "epdFin" by itself: Syntax: epdFin filename.epd [record_number] Output: outF.epd Comments: 1. Because "epdFin" requires two tools to be executed successively, a "Windows" batch script ("cmd" file) may be used for convenience. You can copy and paste the following code into a text editor, and save it with the name "pgnFin.cmd" : PGN-Extract -Wepd --nofauxep -s -otemp.epd %1 epdFin temp.epd %2 Syntax: pgnFin filename.pgn [record_number] Example: pgnFin beta.pgn Example: pgnFin beta.pgn 5 2. "pgnFin.cmd" is included in the "40H-EPD" download. 3. The output file "outF.epd" can be used by "epdOccur" to see which games have the same final position. In this case, you should number comment each game using "gameNum" from "40H-PGN". 4. All files mentioned within the batch script must be in the Working Folder or on the Path. 
=========================(9) epdFlip ================================ "epdFlip" reverses the colors and does a vertical reflection about the imaginary horizontal line between ranks "4" and "5". The color to move, "castling" rights and "en passant" rights are also reversed. The new "epd" record is logically equivalent to the old "epd" record. r2qkb1r/pp1n1ppp/2p2n2/3pp3/6b1/3P1NP1/PPPNPPBP/R1BQ1RK1 w kq e6 r1bq1rk1/pppnppbp/3p1np1/6B1/3PP3/2P2N2/PP1N1PPP/R2QKB1R b KQ e3 To get a visual comparison, open 2 instances of your "PGN viewer". Put the original "epd" record in one instance, and put the "flipped" "epd" record in the other. Be sure both "PGN Viewers" have the same "View". For example, neither is in "flip" view mode. Compare. The output files of "epdFlip" do not retain opcodes since flipping could make the original opcodes incorrect. The output has 3 files: "outZ.epd", "outZB.epd", and "outZW.epd". "outZ.epd" contains the flipped "epd" records. All opcodes are removed. "outZB.epd" contains the original "Black to move" records and the flipped "White to move" records. All records in "outZB.epd" have "Black to move". All opcodes are removed. "outZW.epd" contains the original "White to move" records and the flipped "Black to move" records. All records in "outZW.epd" have "White to move". All opcodes are removed. If the user wishes to restore the opcodes, here is what he should do. First, save the opcodes to "hlist" with "epdTrim". Second, run "epdFlip". Third, restore the opcodes to the output file with "idOpcode". For example: epdTrim alpha.epd epdFlip alpha.epd copy hlist inlist idopcode inlist outZ.epd Syntax: epdFlip filename.epd Example: epdFlip alpha.epd Output: outZ.epd, outZB.epd, outZW.epd Comments: 1. "epdColor" separates "epd" records by the color to move. 2. "epdTrim" can remove opcodes and save them. 3. 
"epdInsert" can insert opcodes, =========================(10) epdImbalance ========================== "epdImbalance" separates the records into two files. One for records where the opposing sides have different material (imbalanced), and the other for records where the opposing sides have the same material (balanced). "outIM.epd" contains the records where the two sides DO NOT have the same set of pieces (imbalanced) and "excludeIM.epd" contains the records where the two sides have the same set of pieces (balanced). Syntax: epdImbalance filename.epd Example: epdImbalance alpha.epd Output: outIM.epd, excludeIM.epd Comment: 1. No distinction is made between bishops that move on different colored squares ("light"/"dark"). =========================(11) epdInsert ============================ "epdInsert" appends new opcodes to the records. "epdInsert" can be used in conjunction with "bmOpcode" and "idOpcode". New opcodes are first put into a text file "inlist". Sample "inlist" file: bm a4; bm Be2; c1 stalemate; The above 3 opcodes will be appended to records 1-2-3 of the "epd" input file. The user MUST check "inlist" for accuracy and line number. The new opcodes are attached to the records in the "epd" input file with the same line numbers. Blank lines are permitted in "inlist" and are necessary if the corresponding records in the "epd" input file are to be unchanged. "inlist" must be located in the Working Folder and cannot be referenced using a pathname. Syntax: epdInsert inlist filename.epd Example: epdInsert inlist alpha.epd Output: outN.epd =========================(12) epdInsuff / pgnInsuff ================ "epdInsuff / pgnInsuff" is used in combination with the utility tool "PGN-Extract" by David Barnes. Together they extract drawn games that end with insufficient checkmating material. There are 4 piece combinations that are insufficient for a checkmate. They are (1) Kk, (2) KBk / Kkb, (3) KNk / Kkn, and (4) KNNk / Kknn. 
In each of the 4 piece combinations, no additional pieces are on the chessboard.

"epdInsuff" extracts all positions from the input "epd" file that have one of the four combinations. Each extracted position is written to two files: a general file for any insufficient material position, and a file specific to the type of piece combination.

There are five output files for the extracted "epd" positions:

"outUT.epd" contains any position that has one of the four piece combinations (with no additional pieces).
"outU0.epd" contains any position that has the "Kk" piece combination (with no additional pieces).
"outU1.epd" contains any position that has the "KBk / Kkb" piece combination (with no additional pieces).
"outU2.epd" contains any position that has the "KNk / Kkn" piece combination (with no additional pieces).
"outU3.epd" contains any position that has the "KNNk / Kknn" piece combination (with no additional pieces).

"epdInsuff" also outputs five "number" files that contain the game numbers of the games in the input "pgn" file that contain the "epdInsuff" output positions. These "number" files can be used by the "40H-PGN" tool "numExtract" to extract those games from the input "pgn" file. The "number" files are:

"numsT" contains the game numbers of the "pgn" games containing an insufficient material piece combination (with no additional pieces).
"nums0" contains the game numbers of the "pgn" games containing a "Kk" piece combination (with no additional pieces).
"nums1" contains the game numbers of the "pgn" games containing a "KBk / Kkb" piece combination (with no additional pieces).
"nums2" contains the game numbers of the "pgn" games containing a "KNk / Kkn" piece combination (with no additional pieces).
"nums3" contains the game numbers of the "pgn" games containing a "KNNk / Kknn" piece combination (with no additional pieces).
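The four piece combinations can be illustrated with a short Python sketch. This is NOT the code of "epdInsuff"; the function name "insufficient_class" is hypothetical. It assumes a well-formed "epd" record with exactly one King per side, and returns the digit used in the output file names above ("0" for "outU0.epd", and so on).

```python
# Illustrative sketch only; not the actual "epdInsuff" implementation.
# Non-King pieces on the board, sorted, mapped to the output file digit.
COMBINATIONS = {
    "": 0,                # Kk
    "B": 1, "b": 1,       # KBk / Kkb
    "N": 2, "n": 2,       # KNk / Kkn
    "NN": 3, "nn": 3,     # KNNk / Kknn
}


def insufficient_class(epd):
    """Return 0-3 for one of the four combinations, otherwise None."""
    # Only the piece-placement field is scanned.
    placement = epd.split()[0]
    others = "".join(sorted(ch for ch in placement
                            if ch.isalpha() and ch not in "Kk"))
    return COMBINATIONS.get(others)


print(insufficient_class("8/8/4k3/8/8/2NKN3/8/8 w - -"))  # 3
print(insufficient_class("8/8/4kn2/8/8/3KN3/8/8 w - -"))  # None
```

The second call returns None because one Knight on each side is not one of the four listed combinations.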
"epdInsuff" can be used with "pgn-Extract" by David Barnes and "numExtract" from "40H-PGN" to input a "pgn" file and output games that end in a position with insufficient material to checkmate. Because a player has to claim a "draw", games with insufficient checkmating material can (needlessly) continue if no "draw" is claimed. In such a scenario, it is possible for the piece combination to be reduced. For example, "KNNk" can reduce to "KNk" and then to "Kk". ---------- When using "epdInsuff" by itself: Usage: epdInsuff alpha.epd Output: outUT.epd, outU0.epd, outU1.epd, outU2.epd, outU3.epd numsT, nums0, nums1, nums2, nums3 --------- When using "epdStuff" with "PGN-Extract": Usage: PGN-Extract -Wepd --nofauxep -s -otemp.epd beta.pgn epdInsuff temp.epd copy numsT numbers numExtract numbers beta.pgn Output: outZ.pgn, excludeZ.pgn, numsT, nums0, nums1, nums2, nums3, outUT.epd, outU0.epd, outU1.epd, outU2.epd, outU3.epd, temp.epd, -------- Comments: 1. Since you need to use several steps to extract "pgn" games containing a position with insufficient material, a "Windows" batch script ("cmd" file) may be used for convenience. You can copy and paste the following code into a text editor, and save it with the name "pgnInsuff.cmd" : PGN-Extract -Wepd --nofauxep -s -otemp.epd %1 epdInsuff temp.epd copy numsT numbers numExtract numbers %1 Usage: pgnInsuff beta.pgn Output: outZ.pgn, excludeZ.pgn The output file "outZ.pgn" contains those games having one or more positions with insufficient material for a checkmate. Multiple positions are possible if the game needlessly continues. No game is output more than once. 2. Sometimes players keep on playing after reaching a position where there is insufficient material to checkmate. This can cause a game to have multiple positions with the same insufficient material. It can also cause a game to have more than one piece combination with insufficient material. 
For example, a "KNNk" game can lead to a "KNk" game, which in turn can lead to a "Kk" game. 3. "pgnInsuff.cmd" is included in the "40H-EPD" download. 4. All files mentioned within the batch script must be in the Working Folder or on the Path. =========================(13) epdKings ============================= "epdKings" extracts records where the Kings are advanced beyond each other. In other words, a record is extracted if the White King's rank (row) is higher than the Black King's rank (row). In the opening position, the White pieces are on Rank 1, and the Black pieces are on Rank 8. An example of the type of position being extracted is the White King on Rank 5 and the Black King on Rank 4. "outG.epd" contains the extracted records. "excludeG.epd" contains the remaining records. Syntax: epdKings filename.epd Example: epdKings alpha.epd Output: outG.epd, excludeG.epd =========================(14) epdMask / pgnMask ==================== "epdMask" extracts records based on a minimal user-specified "pieces/location structure" such as "2kr4/8/8/8/8/8/8/2KR4". "epdMask" can also be used in combination with "PGN-Extract" by David Barnes. The "pieces/location structure" cannot have any embedded spaces, and does not have to be enclosed in quotation marks. The "pieces/location structure" is a minimal list. Kings are NOT required to be part of the structure. The extracted records will usually contain additional pieces. The output file "outK.epd" contains the extracted records. The output file "outK2.epd" adds the input line number to each record by using an opcode named "line". The next two output files are ONLY useful when "epdMask" is used in combination with "PGN-Extract". The output file "outK3.epd" adds the input game number and ply number to each record by using opcodes "game" and "ply" respectively. The output file "numbers" lists the game numbers of the games that produced extracted records. 
"numbers" is used as input to the "40H-PGN" tool "numExtract" to extract the "pgn" games. When using "epdMask" by itself: Syntax: epdMask filename.epd epd_structure Output: outK.epd, outK2.epd, outK3.epd, numbers To extract games from a "pgn" file that contains a position with the user-specified "pieces/location structure", you would execute these commands: PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn epdMask temp.epd epd_structure numExtract numbers filename.pgn "PGN-Extract" outputs "temp.epd", which contains all the "epd" records (positions) encountered in the input "pgn" file. A blank line separates successive games. "epdMask" inputs "temp.epd" and the user-specified "pieces/location structure". Then each "epd" record, where the structure is found, is output to "outK.epd", "outK2.epd" and "outK3.epd". Also, the game number is output to the file "numbers". "numExtract" inputs the original "pgn" file and the file "numbers" and extracts the games. Only one instance of the game is output even if the game has multiple positions with the structure. When using "epdMask" by itself: Syntax: epdMask filename.epd epd_structure Example: epdMask alpha.epd 2kr4/8/8/8/8/8/8/2KR4 Output: outK.epd, outK2.epd, outK3.epd, numbers When using "epdMask" in combination with "PGN-Extract": Syntax: PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn epdMask temp.epd epd_structure numExtract numbers filename.pgn Example: PGN-Extract -Wepd --nofauxep -s -otemp.epd beta.pgn epdMask temp.epd 2kr4/8/8/8/8/8/8/2KR4 numExtract numbers beta.pgn Output: outZ.pgn, excludeZ.pgn The output file "outZ.pgn" contains those games having a position with the user-specified structure. Comments: 1. "epdMask" output matches the user-specified "pieces/location structure" MINIMALLY. That means that additional pieces can be present. 2. 
Since you need to use several steps to extract "pgn" games containing a position with the user-specified structure, a "Windows" batch script ("cmd" file) may be used for convenience. You can copy and paste the following code into a text editor, and save it with the name "pgnMask.cmd" :

    PGN-Extract -Wepd --nofauxep -s -otemp.epd %1
    epdMask temp.epd %2
    numExtract numbers %1

Syntax: pgnMask filename.pgn epd_structure
Example: pgnMask beta.pgn 2kr4/8/8/8/8/8/8/5RK1
Output: outZ.pgn, excludeZ.pgn

3. "pgnMask.cmd" is included in the "40H-EPD" download.
4. All files mentioned within the batch script must be in the Working Folder or on the Path.

=========================(15) epdMaterial/pgnMaterial ==============

"epdMaterial" extracts records based on a user-specified text file "pieces" that lists piece/quantity values. "epdMaterial" can also be used in combination with "PGN-Extract" by David Barnes.

To create the text file "pieces", use the "epd" notation for the different pieces followed by a space and then the quantity. Each piece must be on a separate line. Kings are not listed since there must be 1 for each side. For example, "pieces" could be:

    N 2
    R 0
    Q 1
    b 0
    r 2
    p 4
    q 0

The above piece/quantity values require an extracted record to have (in addition to the 2 kings) exactly 2 White knights, 0 White rooks, 1 White queen, 0 Black bishops, 2 Black rooks, 4 Black pawns and 0 Black queens.

Note that an extracted record must not have a White rook ("R"), a Black bishop ("b") or a Black queen ("q") because they were assigned a value of 0.

Also note that the extracted records do not have a requirement for the White pawns, White bishops, or Black knights because "P", "B", and "n" were omitted from "pieces". Those pieces can be present in any legal quantity.

Pieces and their quantities can be listed in "pieces" in any order.
The use of the file "pieces" allows the user to specify, or NOT specify, an exact quantity for each of the 10 types of pieces: (P, N, B, R, Q, p, n, b, r, q).

When using "epdMaterial" by itself:

Syntax: epdMaterial pieces filename.epd
Example: epdMaterial pieces alpha.epd
Output: outV.epd, outV2.epd, outV3.epd, numbers

The output file "outV.epd" contains those records in the input "epd" file whose material matches "pieces".

The output file "outV2.epd" adds the line numbers from the input "epd" file by using an opcode named "line".

The next two output files are ONLY useful when "epdMaterial" is used in combination with "PGN-Extract".

The output file "outV3.epd" adds the game numbers from the input "pgn" file, and the ply numbers from those games, by using opcodes "game" and "ply".

The output file "numbers" lists the game numbers of the "pgn" games that produced the extracted records. "numbers" is used as input to the "40H-PGN" tool "numExtract" to extract the "pgn" games.

To extract games from a "pgn" file that contains a position with the material specified in "pieces", you would execute these commands:

    PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn
    epdMaterial pieces temp.epd
    numExtract numbers filename.pgn

"PGN-Extract" outputs an "epd" file, "temp.epd", which contains ALL the positions ("epd" records) encountered in the input "pgn" file. A blank line separates successive games.

"epdMaterial" inputs the user-specified text file "pieces" and the file "temp.epd". It then outputs each record having the material specified in "pieces" to "outV.epd", "outV2.epd" and "outV3.epd". It also outputs a list of game numbers that produced the extracted records to the file "numbers".

"numExtract" inputs the file "numbers" and the original "pgn" file and extracts games containing a position having the material specified in "pieces". Only one instance of the game is output even if the game has multiple positions with the material specified by "pieces".
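The matching rule for the file "pieces" described earlier in this section can be illustrated with a short Python sketch. This is NOT the code of "epdMaterial"; the function names "parse_pieces" and "matches_material" are hypothetical. It assumes one "piece quantity" pair per line and a well-formed "epd" record.

```python
# Illustrative sketch only; not the actual "epdMaterial" implementation.
def parse_pieces(text):
    """Read "piece quantity" lines into a dict such as {"N": 2, "q": 0}."""
    required = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            # Piece letters are case-sensitive: "N" is White, "n" is Black.
            required[parts[0]] = int(parts[1])
    return required


def matches_material(epd, required):
    """True when every listed piece occurs exactly the listed number of times."""
    # Only the piece-placement field is scanned; unlisted pieces are ignored.
    placement = epd.split()[0]
    return all(placement.count(piece) == quantity
               for piece, quantity in required.items())


pieces = parse_pieces("N 2\nR 0\nQ 1\nb 0\nr 2\np 4\nq 0")
print(matches_material("r4rk1/pppp4/8/8/8/2N2N2/PPP5/3Q2K1 w - -", pieces))  # True
print(matches_material("r4rk1/pppp4/8/8/8/2N2N2/PPP5/R2Q2K1 w - -", pieces))  # False
```

The second record fails because it contains a White rook, which "pieces" lists with a quantity of 0. The White pawns are not listed, so their number does not affect the result.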
When using "epdMaterial" in combination with "PGN-Extract":

Syntax: PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn
        epdMaterial pieces temp.epd
        numExtract numbers filename.pgn
Example: PGN-Extract -Wepd --nofauxep -s -otemp.epd beta.pgn
         epdMaterial pieces temp.epd
         numExtract numbers beta.pgn
Output: outZ.pgn, excludeZ.pgn

Comments:
1. "pieces" is case-sensitive.
2. Since several steps are needed to extract "pgn" games containing a position with the user-specified pieces, a "Windows" batch script ("cmd" file) may be used for convenience. You can copy and paste the following code into a text editor, and save it with the name "pgnMaterial.cmd" :

    PGN-Extract -Wepd --nofauxep -s -otemp.epd %2
    epdMaterial pieces temp.epd
    numExtract numbers %2

Syntax: pgnMaterial pieces filename.pgn
Example: pgnMaterial pieces beta.pgn
Output: outZ.pgn, excludeZ.pgn

3. "pgnMaterial.cmd" is included in the "40H-EPD" download.
4. All files mentioned within the batch script must be in the Working Folder or on the Path.

=========================(16) epdOccur ==============================

"epdOccur" lists the number of occurrences of each distinct position in the input "epd" file. An "epd" position consists of the first 4 tokens of an "epd" record. "epdOccur" also lists the line numbers where the positions occurred.

Records having the same "piece placement", "side to move", "castling rights" and "en passant" target square values are considered to have the same position. All four items must agree. Records with the same position can have different opcodes.

A record with a faux "en passant" target square will have a different position compared to the same record not having a target square. Therefore the user should remove faux "en passant" target squares before using "epdOccur". The tool "epdEnPass" can be used to remove them.

The output files are "outL1.epd" and "outL2.epd". "outL1.epd" contains opcode "c0" which lists the number of occurrences of the position.
"outL2.epd" also contains the opcode "c1" which lists the line numbers of the occurrences. Original opcodes are omitted in the output files. Example of an output line in "outL2.epd": 3Q4/p3b1k1/2p2rPp/2q5/4B3/P2P4/8/6RK w - - c0 2; c1 line(s): 467 536; c0 indicates two occurrences of the position and c1 indicates that they are on lines 467 and 536. If you want to remove duplicate "epd" positions, then you should use "epdSingle". Syntax: epdEnPass filename.epd epdOccur outEP.epd Output: outL.epd, outL2.epd Comments: 1. Removing faux "en passant" target squares from the "epd" input file is necessary to ensure full and accurate output from "epdOccur". This can be accomplished by first using "epdEnPass". Or, if the file was created by "PGN-Extract", be sure the "--nofauxep" parameter was used. 2. "txtOccur" lists the occurrences of full lines whereas "epdOccur" lists the occurrences of records containing the same position. 3. "epdOccur" ignores blank lines and does not output blank lines. =========================(17) epdOrder ============================= "epdOrder" sorts the records in descending order based on the "centipawn evaluation" ("ce" opcode) of the position from White's perspective. The "ce" opcode evaluates the strength of the position from the viewpoint of the White pieces. The record with the highest "ce" value (the strongest position for White) is listed first and the record with the lowest "ce" value (the weakest position for White) is listed last. "ce" opcode values are expressed in "centipawns". A "centipawn" is one-hundredth of a pawn. Examples: "137" and "-137". A "ce" value of 137 means that White is evaluated to be winning by the equivalent of 1.37 pawns. Similarly, a "ce" value of -137 means that White is evaluated to be losing by the equivalent of 1.37 pawns. In addition, there are "ce" values that start with "+M" and "-M". Examples: "+M5" and "-M5". These values indicate the number of moves to checkmate with ACCURATE play. 
A value of "+M2" is rated higher than "+M5" because White can mate in fewer moves. Likewise "-M5" is rated higher than "-M2" because White gets mated in a greater number of moves. The "M" values are technically not part of "EPD Standards" but are commonly used as they are easily converted to and from the numerical values that "ce" uses.

Syntax: epdOrder filename.epd
Example: epdOrder alpha.epd
Output: outD.epd

=========================(18) epdPawnDifference ====================

"epdPawnDifference" separates records based on the difference in the number of pawns of the two sides.

"outPD.epd" contains the records where the absolute value of the difference in the number of pawns of the two sides equals or exceeds a user-specified number (1-8). "excludePD.epd" contains the remaining records.

The optional "min_pawn_difference" parameter must be a number from "1" to "8".

For example:

    epdPawnDifference alpha.epd 1

"outPD.epd" contains records where the absolute value of the difference in the number of pawns is "1" or more, and "excludePD.epd" contains the remaining records.

    epdPawnDifference alpha.epd 3

"outPD.epd" contains records where the absolute value of the difference in the number of pawns is "3" or more, and "excludePD.epd" contains the remaining records.

Syntax: epdPawnDifference filename.epd min_pawn_difference
Examples: epdPawnDifference alpha.epd 1
          epdPawnDifference alpha.epd 5
Output: outPD.epd, excludePD.epd

=========================(19) epdPieces ============================

"epdPieces" extracts records based on a user-specified number range for the number of chess pieces on the board.

Usage examples:

    epdPieces alpha.epd 10 15

will output "epd" records containing 10 to 15 pieces inclusive.

    epdPieces alpha.epd 11 11

will output "epd" records containing exactly 11 pieces.

The output file "outP.epd" contains the "epd" records with the number of pieces in the user-specified number range.
The output file "outP2.epd" adds the input line number to each record by using an opcode named "line". The output file "lineNums" lists the line numbers of the output records in the input "epd" file.

Syntax: epdPieces filename.epd min_pieces max_pieces
Examples: epdPieces alpha.epd 10 10
          epdPieces alpha.epd 8 11
Output: outP.epd, outP2.epd, lineNums

=========================(20) epdPly / pgnPly ======================

"epdPly" is used in combination with the utility tool "PGN-Extract" by David Barnes. Together they input a "pgn" file. Then for each game, they output the "epd" record of the position after a user-supplied number of plies.

"PGN-Extract" outputs an intermediate "epd" file, "temp.epd", which contains all the "epd" records (positions) encountered in the input "pgn" file. A blank line separates successive games.

Then "epdPly" inputs "temp.epd" and a user-specified ply number. Then for each game, the "epd" record for the position after the user-specified ply is output to "outY.epd". If the record does not exist, a blank line is output.

For example, if 0 is the user-specified ply number, the first record of each game is output. If 24 is the user-specified ply number, the 25th record of each game, if it exists, is output. If it does not exist, a blank line is output.

Syntax: PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn
        epdPly temp.epd ply_number
Example: PGN-Extract -Wepd --nofauxep -s -otemp.epd alpha.pgn
         epdPly temp.epd 30
Output: outY.epd

When using "epdPly" by itself:

Syntax: epdPly filename.epd ply_number
Output: outY.epd

Comments:
1. Because "epdPly" requires two tools to be executed successively, a "Windows" batch script ("cmd" file) may be used for convenience. You can copy and paste the following code into a text editor, and save it with the name "pgnPly.cmd" :

    PGN-Extract -Wepd --nofauxep -s -otemp.epd %1
    epdPly temp.epd %2

Syntax: pgnPly filename.pgn ply_number
Example: pgnPly beta.pgn 30

2.
"pgnPly.cmd" is included in the "40H-EPD" download. 3. The output file "outY.epd" can be used by "epdOccur" to see which games have the same position after the user-specified ply. In this case, you should number comment each game using "gameNum" from "40H-PGN". 4. All files mentioned within the batch script must be in the Working Folder or on the Path. =========================(21) epdPosition / pgnPosition ============ "epdPosition" inputs a user-specified "epd" search file and an "epd" input data file, then outputs records in the input data file that match the first and second fields of a position in the search file. "epdPosition" can be used in combination with the utility tool "PGN-Extract", by David Barnes, and the "40H-PGN" tool "numExtract". Used together, they input a user-specified "epd" search file and a "pgn" game file, then outputs the "pgn" games matching a position in the "epd" search file. "epdPosition" seeks to match both the first and second fields of each record in the "epd" input file with the corresponding fields of a record in the "epd" search file. An example of the first and second fields of an "epd" record: rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w The first field above is the "pieces/location" field and the second field is the "side to move" field. The above is not a full "epd" record because the castling rights field and the "en passant" target square field are missing. "epdPosition" outputs two files, "manifest-ps" and "numbers". For each "epd" match, "manifest-ps" lists the "epd" record from "search.epd" and the "epd" record from the input "epd" file. It then lists the source of the input "epd" record. In the case where "PGN-Extract" was used ("pgn" input), the game number and ply number in the input "pgn" file are listed. In the case where "epdPosition" is used by itself ("epd" input), the line number record in the input "epd" file is listed. There may be some differences between the two matching "epd" records in "manifest-ps". 
They might differ with regard to castling rights, "en passant" target square, or opcodes. This is because the input "epd" records were only matched on the first two fields. "numbers" lists the game numbers where the matches occurred, assuming "PGN-Extract" was used. It is an input file for "numExtract". When used with "PGN-Extract", "PGN-Extract" precedes "epdPosition". "PGN-Extract" inputs a "pgn" file and outputs an "epd" file containing the positions of the "pgn" games. The "epd" file produced by "PGN-Extract" is processed as the "epd" input file by "epdPosition". The "numbers" file produced by "epdPosition" is processed by "numExtract" along with the original "pgn" file. "numExtract" outputs "outZ.pgn" which contains the games containing the matching positions. A game with a match is output only once, even if the game has multiple matches. However, each match will be listed separately in "manifest-ps". If "search.epd" contains more than one record, you should use the "40H-PGN" tool "gameNum" on the input "pgn" file before using "PGN-Extract". This will make it easier to match "pgn" games and "epd" records. When using "epdPosition" by itself: Syntax: epdPosition search.epd filename.epd Example: epdPosition search.epd alpha.epd Output: manifest-ps, numbers When using "epdPosition" with "PGN-Extract" and "numExtract": Syntax: PGN-Extract -Wepd --nofauxep -s -otemp.epd filename.pgn epdPosition search.epd temp.epd numExtract numbers filename.pgn Example: PGN-Extract -Wepd --nofauxep -s -otemp.epd alpha.pgn epdPosition search.epd temp.epd numExtract numbers alpha.pgn Output: manifest-ps, numbers, outZ.pgn Comments: 1. Because three tools are to be executed successively, a "Windows" batch script ("cmd" file) may be used for convenience. 
You can copy and paste the following code into a text editor, and save it with the name "pgnPosition.cmd" : PGN-Extract -Wepd --nofauxep -s -otemp.epd %2 epdPosition %1 temp.epd numExtract numbers %2 Examples: pgnPosition search.epd alpha.pgn 2. "pgnPosition.cmd" is included in the "40H-EPD" download. 3. "pgnPosition.cmd" combined with a very large "pgn" database will enable the user to find many "proof games" (moves from the starting position to a selected "epd" position). 4. "temp.epd" is produced by "PGN-Extract". It is a very large file because it contains the "epd" record of each position in the input "pgn" file. It should NOT be deleted too soon because you may want to reuse it and it takes a long time to produce. =========================(22) epdRemove ============================ "epdRemove" removes a user-specified opcode from the "epd" input file and saves the removed opcode and its values in a second output file. "epdRemove" outputs the file "outR.epd". "outR.epd" does not contain the specified opcode. "rlist" lists the removed opcode and its values. This tool could also be called "epdOpcodeExtract" because of "rlist". The user can only specify one opcode per execution. If there are multiple instances of the same opcode in a record, all instances will be removed. If you want to use "epdRemove" again, you have to first rename the output file before using it as input file. Use "epdTrim" if you want to remove all opcodes. Blank lines are retained in both "outR.epd" and "rlist". Syntax: epdRemove filename.epd opcode_type Example: epdRemove alpha.epd id Output: outR.epd, rlist =========================(23) epdSingle ============================ "epdSingle" removes records containing the same position as a previous record in the "epd" file. An "epd" position consists of the first 4 tokens of an "epd" record. The removed records are saved. 
Positions having the same "piece placement", "side to move", "castling rights" and "en passant" target square values are considered to have the same position. All four items must agree. A record with a faux "en passant" target square will have a different position compared to the same record not having a target square. Therefore remove faux "en passant" target squares before using "epdSingle". The tool "epdEnPass" can be used to remove them. Records with the same position can have different opcodes. The output file is outA.epd. The removed records are saved in excludeA.epd. The output records remain in the original order except for the duplicate records that were removed. Opcodes are not changed. Blank lines are removed. If you are interested in the number of occurrences of each position and, optionally, what line they are on, then you should use "epdOccur". Syntax: epdEnPass filename.epd epdSingle outEP.epd Output: outA.epd, excludeA.epd Comments: 1. Removing faux "en passant" target squares from the "epd" input file is necessary to ensure full and accurate output from "epdSingle". This can be accomplished by first using "epdEnPass". Or, if the file was created by "PGN-Extract", be sure the "--nofauxep" parameter was used. 2. "txtSingle" removes duplicates of full lines whereas "epdSingle" removes lines containing the same position as a previous record. 3. "epdSingle" ignores blank lines and does not output blank lines. =========================(24) epdSort ============================== "epdSort" sorts the records of an "epd" file alphanumerically based on the first four fields of each record. It sorts in ascending or descending order. Default is ascending order. Use the optional parameter "down" for descending order. Blank lines are deleted in the output file. Syntax: epdSort filename.epd [down] Examples: epdSort alpha.epd epdSort alpha.epd down Output: outS.epd Comments: 1. "down" is case-sensitive. 2. Use "txtSort" for sorting the lines of a general text file.
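Both "epdSingle" and "epdSort" above work on the first four fields of each "epd" record. The following Python sketch illustrates that idea only; it is not the tools' actual code. The function names ("position_key", "single", "sort_records") are hypothetical, and the use of Python's default string ordering as the "alphanumeric" order is an assumption.

```python
def position_key(record):
    """Return the first four whitespace-separated fields of an EPD record:
    piece placement, side to move, castling rights, en passant square."""
    return tuple(record.split()[:4])


def single(records):
    """Keep the first record of each position (like outA.epd) and collect
    later records with the same position (like excludeA.epd)."""
    seen = set()
    kept, removed = [], []
    for rec in records:
        if not rec.strip():
            continue  # blank lines are ignored and not output
        key = position_key(rec)
        if key in seen:
            removed.append(rec)
        else:
            seen.add(key)
            kept.append(rec)
    return kept, removed


def sort_records(records, down=False):
    """Sort non-blank records on their first four fields.
    Assumption: plain code-point ordering stands in for the tool's ordering."""
    nonblank = [r for r in records if r.strip()]
    return sorted(nonblank, key=position_key, reverse=down)


records = [
    'rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq - id "a";',
    '',
    'rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq - id "b";',
    'rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq - id "c";',
]

kept, removed = single(records)
print(len(kept), len(removed))
for rec in sort_records(records, down=True):
    print(rec)
```

The records with id "a" and id "c" differ only in their opcodes, so the sketch treats them as the same position. A faux "en passant" target square would change the fourth field and defeat that match, which is why such squares need to be removed first.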
=========================(25) epdTrim ============================== "epdTrim" removes all opcodes from the "epd" input file, and saves the removed opcodes two different ways - horizontally and vertically. There are 3 output files. The output file "outT.epd" does not have any opcodes. Blank lines are retained. The output file "hlist" lists the removed opcode values horizontally as they appear in the input file. Blank lines are retained. Sample output in "hlist": bm f5; id "Undermine.001"; c0 "f5=10, Be5+=2, Bf2=3, Bg4=2" bm c5; id "Undermine.002"; c0 "c5=10, Qd4+=4, b5=4, g4=3"; The output file "vlist" lists the line number and then each removed opcode value on a separate line. Sample output in "vlist": line # 1 bm f5 id "Undermine.001" c0 "f5=10, Be5+=2, Bf2=3, Bg4=2" line # 2 bm c5 id "Undermine.002" c0 "c5=10, Qd4+=4, b5=4, g4=3" Syntax: epdTrim filename.epd Example: epdTrim alpha.epd Output: outT.epd, hlist, vlist Comments: 1. Use "epdRemove" to remove one opcode or to list the values of one opcode. =========================(26) epdTriplePawns ======================= "epdTriplePawns" extracts records where there are 3 or more pawns of the same color in the same column. The extracted "epd" records are in "outTP.epd". Other records are in "excludeTP.epd". Syntax: epdTriplePawns filename.epd Example: epdTriplePawns alpha.epd Output: outTP.epd, excludeTP.epd =========================(27) idOpcode ============================= "idOpcode" outputs a file of "id" opcodes, each containing an ID number. The default starting ID number is "1". Succeeding numbers increase by 1. The user can optionally specify a different starting number. The starting ID number can be changed by listing it at the end of the command line. For example: idOpcode alpha.txt 31 will start with "31" as the first ID number. Sample output in output file "idlist": id "1"; id "2"; id "3"; The user can attach the opcodes in "idlist" to the records of the "epd" file by using "epdInsert". 
For example: idOpcode beta.txt copy idlist inlist epdInsert inlist beta.txt Sample output line: r1bq1rk1/pp2bppp/2n2n2/2pp2B1/3P4/2N2NP1/PP2PPBP/R2Q1RK1 b - - id "4"; Syntax: idOpcode filename.epd [startNum] Examples: idOpcode alpha.epd idOpcode alpha.epd 51 Output: idlist =========================(28) smOpcode ============================= "smOpcode" outputs supplied move ("sm") opcodes from a "pgn" file. Each "sm" opcode corresponds to an actual move in the "pgn" file. For example: sm 14W Qxh7; indicates that White's 14th move is Qxh7. "smOpcode" inputs a "pgn" file and outputs opcodes for attachment in an "epd" file. The output file "smlist" lists the "sm" opcodes one per line. If there are multiple games in the "pgn" file, then "smlist" separates the "sm" opcodes of successive games by inserting two blank lines. The reason for the extra blank line is that no move is played from the final position. The "sm" opcodes can be appended to the "epd" records of the positions where the moves were played. An "epd" file can be created from an input "pgn" file using "PGN-Extract" by David Barnes. Then "epdInsert" can be used to append the "sm" opcodes to the corresponding records of the "epd" file. To append the "sm" opcodes to the "epd" file from "PGN-Extract": 1. PGN-Extract -Wepd --nofauxep -s -obeta.epd beta.pgn 2. smOpcode beta.pgn 3. << User checks smlist against beta.epd >> 4. copy smlist inlist 5. epdInsert inlist beta.epd 6. << Output is outN.epd >> Usage: smOpcode beta.pgn Output: smlist =========================(29) txtColumn =========================== "txtColumn" outputs a contiguous range of columns in a text file. It uses user-specified starting and ending column numbers. The user-specified column numbers must be from 1 to 1000. The column numbers are inclusive. There are 2 output files, "outT.txt" and "excludeT.txt". "outT.txt" contains the extracted columns. "excludeT.txt" contains the columns not in "outT.txt". These columns are condensed and realigned. 
For example: txtColumn alpha.txt 1 20 outputs columns 1 to 20, inclusive, to "outT.txt". The remaining data, columns 21 to the end, is output to "excludeT.txt". txtColumn alpha.txt 15 30 outputs columns 15 to 30, inclusive, to "outT.txt". The remaining data, columns 1 to 14, and columns 31 to the end, is output to "excludeT.txt". "txtColumn" can be used to truncate the lines of a text file. txtColumn alpha.txt 1 40 truncates each line after column 40. Syntax: txtColumn filename.txt starting_column ending_column Example: txtColumn alpha.txt 21 60 Output: outT.txt, excludeT.txt Comments: 1. The input file is NOT required to have a "txt" extension. 2. The input file cannot contain a "tab" character. Only tabs created by multiple spaces are acceptable. =========================(30) txtMerge ============================= "txtMerge" joins 2, 3, 4 or 5 "txt" files by adding one line from each successive input file, then repeating. Files do NOT have to have the same number of lines. Example: Suppose fileA.txt has 4 lines, fileB.txt has 2 lines and fileC.txt has 3 lines. Using the command: txtMerge fileA.txt fileB.txt fileC.txt The output file "outM.txt" will be: line1 from fileA.txt line1 from fileB.txt line1 from fileC.txt line2 from fileA.txt line2 from fileB.txt line2 from fileC.txt line3 from fileA.txt line3 from fileC.txt line4 from fileA.txt Syntax: txtMerge file1.txt file2.txt [file3.txt file4.txt file5.txt] Examples: txtMerge alpha.txt beta.txt txtMerge alpha.txt beta.txt gamma.txt txtMerge alpha.txt beta.txt gamma.txt delta.txt txtMerge alpha.txt beta.txt gamma.txt delta.txt epsilon.txt Output: outM.txt Comments: 1. The input file is NOT required to have a "txt" extension. =========================(31) txtOccur ============================ "txtOccur" lists the number of occurrences of each distinct non-blank line. It also lists the line numbers where the lines occurred. 
The output lines in "outL.txt" contain the line from the "txt" file, followed by the comment indicator "c0", followed by the number of occurrences and followed by ";". Sample output line from "outL.txt": Have a nice day! c0 10; The output lines in "outL2.txt" contain what "outL.txt" contains, followed by the comment indicator "c1", followed by the line numbers of the occurrences and followed by ";". Sample output line from "outL2.txt": Have a nice day! c0 10; c1 8 9 10 34 35 38 44 45 46 47; Syntax: txtOccur filename.txt Example: txtOccur alpha.txt Output: outL.txt, outL2.txt Comments: 1. "txtOccur" lists the occurrences of full lines whereas "epdOccur" lists the occurrences of records containing the same position. 2. "txtOccur" ignores blank lines and does not output blank lines. =========================(32) txtSingle =========================== "txtSingle" removes any line that is a duplicate of a prior line. The remaining lines are in their original order. The prior line does NOT have to be immediately prior. It can be any line previous to the current line. The output file is outA.txt. The removed records are saved in excludeA.txt. Syntax: txtSingle filename.txt Example: txtSingle alpha.txt Output: outA.txt, excludeA.txt Comments: 1. The input file is NOT required to have a "txt" extension. 2. "txtSingle" removes duplicates of full lines whereas "epdSingle" removes lines containing the same position as a previous record. 3. "txtSingle" ignores blank lines and does not output blank lines. =========================(33) txtSort ============================= "txtSort" sorts the lines of a text file alphanumerically. It sorts in either ascending or descending order. Default is ascending order. Use the optional parameter "down" for descending order. Blank lines are deleted in the output file. Syntax: txtSort filename.txt [down] Examples: txtSort alpha.txt txtSort alpha.txt down Output: outS.txt Comments: 1. The input file is NOT required to have a "txt" extension. 2.
"down" is case-sensitive. 3. Use "epdSort" for specifically sorting the records of an "epd" file. =========================(34) txtSplit ============================ "txtSplit" splits a very large text file into as many as five separate text files. Splitting can be repeated. The user specifies a "split_number" from 2 to 5 for the number of output files. The number of lines in each output file will be equal or +/- 1. "txtSplit" can be reused on the output files if they also are too big. However, you have to rename the output files before using them as input files. The number of lines in each output file will be approximately equal. Syntax: txtSplit filename.txt split_num Example: txtSplit alpha.txt 3 Output: pt1.txt, pt2.txt, pt3.txt Comments: 1. The input file is NOT required to have a "txt" extension. 2. Use "copy /b" to concatenate the files back to the original. The "/b" option prevents the "eof" character from being appended to the end of the file. =========================(35) txtZero ============================= "txtZero" aligns a column of integers in a text file by inserting leading spaces or leading zeros. The output file "outR.txt" inserts leading spaces. The output file "outR2.txt" inserts leading zeros. Example- Suppose the input data file alpha.txt is: 32 9 567 12928 90 8021 Using the command line: txtZero alpha.txt The output file "outR.txt" will be: 32 9 567 12928 90 8021 The output file "outR2.txt" will be: 00032 00009 00567 12928 00090 08021 "txtZero" is helpful if you want to sort numbers "numerically" using an alphanumeric sort. The original list of numbers will sort alphanumerically as: 12928 32 567 8021 9 90 The output file "outR2.txt", with leading zeros, will sort as: 00009 00032 00090 00567 08021 12928 Syntax: txtZero textfile_name Usage: txtZero alpha.txt Output: outR.txt. outR2.txt Comment: 1. The input file is NOT required to have a "txt" extension. 
==================================================================== ====================================================================