Add filter file support when creating Resource Groups. by CCPCookies · Pull Request #18 · carbonengine/resources

CCPCookies · 2026-03-04T15:33:19Z

Changes

Filter file rules loading through legacy INI file format support added.
Filter logic matched from resfileserver and eve-resparser.
Documentation of filter file format added.
Documentation covering filter logic added.
Test coverage added for all known filter include scenarios.
Added new global filter rule which is useful for excluding .red files.
Filter logic for 'resfile' field not covered as it is covered by respaths logic.
New publicly exposed library function added to create Resource Groups with filtering.
Logic added to library function to allow skipping empty search directories.
Logic added to library to allow ascertaining compression size from remote filesystem.
Test coverage for library function added.
CLI extended to add Resource Group creation with filters.
Test coverage for CLI operation added.

Version

Rather than changing 'CreateFromDirectory' in the library and 'create-group' in the CLI, a new function and a new operation were added to create Resource Groups using filters. This keeps the API stable, as other parties now work with the original commands. The minor version was bumped.

Extra

Filter ini file parsing logic is from PR16


ccp-chargeback · 2026-03-05T13:00:53Z

tools/src/Downloader.cpp

+			if( findResult != std::string::npos )
+			{
+				// Delimiter found
+				value = line.substr( findResult + 2 );


Why is this +2?
Shouldn't this be just +1?
Unless the attributeName variable contains a ":" at the end, which then has the DELIMITER added to it?

Removes a space, I could add this to delimiter to be clearer
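
One way to make the offset self-explanatory is to derive it from the delimiter itself. This is a minimal sketch, assuming header lines of the form `Name: value`; `kDelimiter` and `ExtractValue` are hypothetical names and are not taken from Downloader.cpp.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical delimiter: the colon plus the space that follows it.
const std::string kDelimiter = ": ";

// Returns everything after the first ": " in the line,
// or std::nullopt if the delimiter is not present.
std::optional<std::string> ExtractValue( const std::string& line )
{
	const std::size_t findResult = line.find( kDelimiter );
	if( findResult == std::string::npos )
	{
		return std::nullopt;
	}
	// The offset is the delimiter length, so no magic "+ 2" is needed.
	return line.substr( findResult + kDelimiter.size() );
}
```

With the space folded into the delimiter, a reader no longer has to guess why the offset is 2 rather than 1.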

ccp-chargeback · 2026-03-05T14:18:52Z

cli/src/CreateResourceGroupFromFilterCliOperation.cpp

+
+#include "CreateResourceGroupFromFilterCliOperation.h"
+
+#include <string>


Minor, order of includes as per coding guidelines
https://didactic-adventure-egnoryz.pages.github.io/cpp_coding_guidelines.html#order-of-includes

ccp-chargeback · 2026-03-05T14:35:35Z

tools/include/FilterFileReader.h

+
+struct FilterFile
+{
+	std::unordered_map<std::string, std::shared_ptr<Prefix>> prefixes;


NOTE when you iterate over this std::unordered_map, you will not get the items returned in the insertion order.
I "think" you may want it to return items in insertion order as part of the ResourceFilter::SetFromFilterFileData() function, that is calling m_prefixPaths.push_back(), which is NOT guaranteed as currently implemented.

Yes you are right, nice catch

I will also add a test to enforce the order importance
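
A possible shape for an insertion-ordered prefix container is sketched below. It is only an illustration: `OrderedPrefixes` is a hypothetical class, and the prefix value is simplified to a plain string instead of the `std::shared_ptr<Prefix>` used in the PR.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical container: iteration follows insertion order,
// lookup by id stays O(1) on average.
class OrderedPrefixes
{
public:
	// Adds a prefix at the end, or overwrites the path of an existing id
	// without changing its position.
	void Set( const std::string& id, std::string path )
	{
		const auto it = m_index.find( id );
		if( it != m_index.end() )
		{
			m_entries[it->second].second = std::move( path );
			return;
		}
		m_index.emplace( id, m_entries.size() );
		m_entries.emplace_back( id, std::move( path ) );
	}

	// Entries in the order in which their ids were first inserted.
	const std::vector<std::pair<std::string, std::string>>& Entries() const
	{
		return m_entries;
	}

private:
	std::vector<std::pair<std::string, std::string>> m_entries;
	std::unordered_map<std::string, std::size_t> m_index;
};
```

A test that inserts ids in non-alphabetical order and then checks `Entries()` would pin the ordering guarantee down.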

ccp-chargeback · 2026-03-05T14:57:48Z

include/ResourceGroup.h

 	/// @note No file filtering supported
 	Result CreateFromDirectory( const CreateResourceGroupFromDirectoryParams& params );

+    /// @brief Creates a ResourceGroup from a supplied filter files.


Typo.
... from supplied filter files. (remove the a)
or
... from a supplied filter file. (change to singular, remove s)

ccp-chargeback · 2026-03-05T15:00:35Z

tools/src/FilterFileReader.cpp

+
+	ParseIncludeExcludeRules( globalFiltersStr, fileData.includeRules, fileData.excludeRules );
+
+	// Get section infomration


Typo:
// Get section information

ccp-chargeback · 2026-03-05T15:09:37Z

tools/src/FilterFileReader.cpp

+	ParseIncludeExcludeRules( globalFiltersStr, fileData.includeRules, fileData.excludeRules );
+
+	// Get section infomration
+	for( const auto& sectionName : reader.Sections() )


NOTE:
The INIReader will return Sections in alphabetical order, not the order they appear in the .ini file.
If you want to keep the sections in the order as defined in the file, you will have to read the file manually and find all the sections and then iterate over that list (instead of reader.Sections())

I.e. change it to do:
std::vector<std::string> sectionsInOrder = ManuallyReadIniFileSectionsInOrderExcludingDefault();
for( const auto& sectionName : sectionsInOrder)

Yes, I remember you saying this. I don't see a scenario where the order of sections matters.

I would also not want to be doing any ini parsing manually; this needs to be supplied by a library. The lib you found so far appears to be sort of ok. Another annoying thing it does is that it removes all the casing from the section names. Not huge, but I'd prefer it didn't.

I agree. It's annoying that it lowercases everything.
It also truncates each sectionName to its first 48 or 50 characters (I can't remember which).
But this ini reader was the best fit, because it:

had vcpkg support

emulated the original python ini file implementation.

As for whether the order of sections matters:
What if the same respath and prefix are present in two different [namedSections], but one of them has a [someFilter] (include) where the other has the same ![someFilter] (exclude filter)?
What happens there?

This will match as it would with the previous system.
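
For reference, collecting section names in file order does not require a full INI parser. The sketch below illustrates the helper suggested earlier in this thread and is not part of the PR. It assumes section headers start in column 0 and that continuation lines of multi-line values are indented; `ReadSectionNamesInOrder` is a hypothetical name.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Returns the section names in the order they appear in the INI text,
// skipping the [DEFAULT] section.
std::vector<std::string> ReadSectionNamesInOrder( const std::string& iniText )
{
	std::vector<std::string> sections;
	std::istringstream input( iniText );
	std::string line;
	while( std::getline( input, line ) )
	{
		// Assumption: only lines with '[' in column 0 are section headers.
		// Indented lines such as "    [ include ]" belong to a value.
		if( line.empty() || line[0] != '[' )
		{
			continue;
		}
		const std::size_t close = line.find( ']' );
		if( close == std::string::npos )
		{
			continue;
		}
		std::string name = line.substr( 1, close - 1 );
		if( name != "DEFAULT" )
		{
			sections.push_back( std::move( name ) );
		}
	}
	return sections;
}
```

Unlike the INI reader discussed above, this keeps the original casing and does not truncate names, but it deliberately ignores everything except headers.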

ccp-chargeback · 2026-03-05T15:45:54Z

doc/source/DesignDocuments/fileFiltering.rst

+
+1. Globally in the ``[DEFAULT]`` section using ``filter =`` .
+2. Section locally in sections using ``filter =`` .
+3. Semi locally to respath adding include/exclude rules to each path ``respaths = prefix1:/* [ include ] ![ exclude ]``


This is correct,
but it renders strange when viewed in a browser (splits the line).
It would be better if the "respaths = prefix1:/* [ include ] ![ exclude ]" would be put on the line below.

ccp-chargeback · 2026-03-05T15:54:34Z

doc/source/DesignDocuments/fileFiltering.rst

+
+2. Section local filters are combined with any filters specified in global filters.
+
+3. respaths filters combine with both global and section filters and importantly these add for all subsequent paths. This is explained more in the following examples.


Isn't this supposed to be:
"...and importantly these add for all subsequent paths WITHIN THE SELECTED SECTION."
I.e. filter defined in [SectionA] will not also be applied to all entries in [SectionB] onwards.

ccp-chargeback · 2026-03-05T15:58:06Z

doc/source/DesignDocuments/fileFiltering.rst

+Two paths will be tested for inclusion: 
+
+``#3`` will use ``respaths = prefix1:/*`` and combine global and section local patterns ``include1`` and ``include2``. This will match the following from the source files:
+1. ``include1.txt``


Add an empty line before the "1. include1.txt", for it to render correctly in a browser.

ccp-chargeback · 2026-03-05T15:58:27Z

doc/source/DesignDocuments/fileFiltering.rst

+2. ``include2.txt``
+
+``#4`` will use ``respaths = prefix1:/* [ include3 ]`` which will extend the section local patterns to include ``include3``. This will match the following source files:
+1. ``include1.txt``


Add empty line, so it renders correctly in a browser.

ccp-chargeback · 2026-03-05T15:58:44Z

doc/source/DesignDocuments/fileFiltering.rst

+3. ``include3.txt``
+
+``#5`` will use ``respaths = prefix2:/*`` and doesn't sepecify any include rules. It will apply the include rules that have been constructed for the section at this point ``include1``, ``include2`` and ``include3``. This may be suprising. So this will match the following source files:
+1. ``Path/include3.txt``


Add an empty line so it renders correctly in a browser.
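
The accumulation described in the quoted documentation (`#3` to `#5`) can be sketched as follows. This is an illustration only: `PathRule` and `AccumulateIncludes` are hypothetical names, only include rules are modelled, and it assumes that every section starts again from the global rules, so nothing carries over from one section to the next.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one respaths entry and its local include rules.
struct PathRule
{
	std::string respath;

	std::set<std::string> localIncludes;
};

// Returns the effective include set for each path of ONE section.
// Rules added by a path stay active for all later paths in that section.
std::vector<std::set<std::string>> AccumulateIncludes(
	const std::set<std::string>& globalIncludes,
	const std::set<std::string>& sectionIncludes,
	const std::vector<PathRule>& paths )
{
	// Start from the global rules and add the section local rules.
	std::set<std::string> current = globalIncludes;
	current.insert( sectionIncludes.begin(), sectionIncludes.end() );

	std::vector<std::set<std::string>> effective;
	for( const PathRule& pathRule : paths )
	{
		current.insert( pathRule.localIncludes.begin(), pathRule.localIncludes.end() );
		effective.push_back( current );
	}
	return effective;
}
```

Fed with the documented example (global `include1`, section `include2`, second path adding `include3`), the third path ends up with all three rules even though it declares none itself.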

ccp-chargeback · 2026-03-05T16:06:27Z

doc/source/DesignDocuments/fileFiltering.rst

+
+    [exampleSection]
+        filter = [ include2 ]                   # 2. Section local include
+        respaths = prefix1:/*                   # 3. respath1


NOTE this is not how multiple (multi-line) respaths are defined in existing res.ini files.
The correct example would look like:

[resCharacterMisc]
respaths = res:/Graphics/Character/Global/...
    res:/Graphics/Character/Female/Skeleton/...
    res:/Graphics/Character/Female/*
    res:/Graphics/Character/Male/Skeleton/...
    res:/Graphics/Character/Male/*
    res:/Graphics/Character/Unique/...

Where the multi-line entries are within the SAME "respaths" attribute.

Ah yeah, I'll change the documentation. The actual tests don't do this, this is just documentation.

ccp-chargeback · 2026-03-05T16:27:08Z

cli/src/CreateResourceGroupFromFilterCliOperation.cpp

+#include <unordered_set>
+
+CreateResourceGroupFromFilterCliOperation::CreateResourceGroupFromFilterCliOperation() :
+	CliOperation( "create-group-from-filter", "Create a Resource Group from a filter files." ),


mixing singular and plural

ccp-chargeback · 2026-03-06T12:04:25Z

src/ResourceGroupImpl.cpp

 #include <Md5ChecksumStream.h>
 #include <GzipCompressionStream.h>
+#include <cctype>
 #include "ResourceInfo/PatchResourceGroupInfo.h"


Look at include grouping and ordering from:
https://didactic-adventure-egnoryz.pages.github.io/cpp_coding_guidelines.html#order-of-includes

ccp-chargeback · 2026-03-06T12:07:09Z

src/ResourceGroupImpl.cpp

+		return Result{ ResultType::FAILED_TO_INITIALIZE_RESOURCE_FILTER, errorMsg };
+	}
+
+	statusSettings.Update( StatusProgressType::PERCENTAGE, 0, 5, "Loading filter files" );


This status line has 0, 5 like the line for "Create resource group from filters".
Should the numbers be updated (is this a copy-paste error)?

ccp-chargeback · 2026-03-06T12:21:21Z

src/ResourceGroupImpl.cpp

+
+			if( inputDirectoryStatus.RequiresStatusUpdates() )
+			{
+				float step = static_cast<float>( 100.0 / searchPaths.size() );


The step variable is going to be the same in every iteration of the loop.
It can be calculated outside of the for loop.

Nah, this way that computation is skipped if not verbose, so we don't calculate something we don't need when not caring about the output.

ccp-chargeback · 2026-03-06T12:27:30Z

src/ResourceGroupImpl.cpp

+				}
+				else
+				{
+					return Result{ ResultType::INPUT_DIRECTORY_DOESNT_EXIST, inputDirectory.string() };


I'm confused.
Why would you ever not want to skip non existent input directories?

Is it only for testing/debug purposes?

No this is due to real world usecase of reduced-resources.

It only syncs files that changed, so in theory it might not (and usually does not) sync a single file in a search directory. If nothing is synced then there is no directory. But this is not a fail case.

ccp-chargeback · 2026-03-06T14:41:47Z

src/ResourceGroupImpl.cpp

+
+						ss << "Processing file: "
+						   << filePathRelativeToInputDirectory.string()
+						   << ", Match filter: "


"Match filter:" vs matchSection
I guess this is supposed to be "Match Section:" or "Match Section Id:"
Unless you also return the current include/exclude filter from the CheckPath() function. That might be useful for debugging.

I was actually thinking of returning the current line number of the path rule so that you can see exactly which rule matched. But no further information was required, so I didn't bother, which also skips some computation time.

ccp-chargeback · 2026-03-06T14:47:45Z

src/ResourceGroupImpl.cpp

+							resourceParams.binaryOperation = ResourceTools::CalculateBinaryOperation( entry.path() );
+						}
+
+						Location l;


Can you change the name of the variable to be more descriptive?

ccp-chargeback · 2026-03-06T15:00:39Z

src/ResourceGroupImpl.cpp

+
+								ResourceTools::Response downloadResponse;
+
+								downloadResponse = downloader.GetHeader( resourceUrl, params.compressionCalculationSettings.downloadSettings.retryCount, params.compressionCalculationSettings.downloadSettings.retrySeconds, response );


Question?
Is this the slow part?

No, this is actually a speedy part.

This stops you needing to calculate the compression data again.

ccp-chargeback · 2026-03-06T15:12:46Z

src/ResourceGroupImpl.cpp

+										}
+										catch( std::invalid_argument& )
+										{
+											resourceProcessGranular.Update( CarbonResources::StatusProgressType::WARNING, 0, 0, "Invalid compression data from header information, compression will be calculated." + resourceUrl );


I'm confused.
Where is the compression information being calculated?
Don't you have to
calculateCompressions = true;

If it can get the compression data from the download header then it takes it from there.
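
The header parsing under discussion could be folded into one helper that either yields a usable size or signals that a fallback is needed. This is a minimal sketch under assumptions: `ParseContentLength` is a hypothetical name, and the caller is assumed to calculate compression locally whenever it receives `std::nullopt`. Note that where `unsigned long` is 32 bits wide (for example with MSVC), `std::stoul` already throws `std::out_of_range` for larger values, so the sketch parses into `unsigned long long` to make the range check meaningful everywhere.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <stdexcept>
#include <string>

// Returns the size from a Content-Length value if it is a plain number
// that fits into uint32_t, std::nullopt otherwise.
std::optional<std::uint32_t> ParseContentLength( const std::string& contentLengthStr )
{
	try
	{
		std::size_t consumed = 0;
		const unsigned long long parsed = std::stoull( contentLengthStr, &consumed );
		if( consumed != contentLengthStr.size() )
		{
			// Trailing characters after the number, treat as invalid.
			return std::nullopt;
		}
		if( parsed > std::numeric_limits<std::uint32_t>::max() )
		{
			return std::nullopt;
		}
		return static_cast<std::uint32_t>( parsed );
	}
	catch( const std::invalid_argument& )
	{
		return std::nullopt;
	}
	catch( const std::out_of_range& )
	{
		return std::nullopt;
	}
}
```

With a single failure path, the "compression will be calculated" warning and the switch to local calculation can live in one place at the call site.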

ccp-chargeback · 2026-03-06T15:13:50Z

src/ResourceGroupImpl.cpp

+											unsigned long in = std::stoul( contentLengthStr );
+											if( in > std::numeric_limits<uint32_t>::max() )
+											{
+												resourceProcessGranular.Update( CarbonResources::StatusProgressType::WARNING, 0, 0, "Invalid compression data from header information, compression will be calculated." + resourceUrl );


Same question, see comment in the catch block below.

Same question? Below just also says 'same' :D

ccp-chargeback · 2026-03-06T15:14:03Z

src/ResourceGroupImpl.cpp

+										}
+										catch( std::out_of_range& )
+										{
+											resourceProcessGranular.Update( CarbonResources::StatusProgressType::WARNING, 0, 0, "Invalid compression data from header information, compression will be calculated." + resourceUrl );


ccp-chargeback · 2026-03-06T15:21:45Z

tools/include/FilterFileReader.h

+
+#include <filesystem>
+#include <vector>
+#include <unordered_map>


Imports in alphabetical order

ccp-chargeback · 2026-03-06T15:27:15Z

tools/include/ResourceFilter.h

+
+struct FilterPath
+{
+	std::string sectionId;


In other structs / classes you've put an empty line between items.
Missing in this struct.

ccp-chargeback · 2026-03-06T15:29:01Z

tools/include/ResourceFilter.h

+	std::string prefixId;
+	std::string path;
+	std::string matchPattern;
+	std::set<std::string> includeRules;


Question:
Sets are good as they enforce uniqueness.
But is there ever a chance that the order of individual include/exclude elements matters?

Doesn't matter

ccp-chargeback · 2026-03-06T15:32:16Z

tools/include/ResourceFilter.h

+	bool containsLocalIncludeExcludeRules;
+};
+
+class ResourceFilter


General suggestion to include comments for classes:
https://didactic-adventure-egnoryz.pages.github.io/cpp_coding_guidelines.html#commenting-class-declarations

ccp-chargeback · 2026-03-06T15:33:31Z

tools/include/ResourceFilter.h

+
+	~ResourceFilter();
+
+	bool SetFromFilterFileData( const FilterFile& fileData );


General suggestion to include member function comments
https://didactic-adventure-egnoryz.pages.github.io/cpp_coding_guidelines.html#commenting-class-member-functions

ccp-chargeback · 2026-03-06T15:34:59Z

tools/src/FilterFileReader.cpp

+{
+}
+
+void FilterFileReader::LoadFromIniFileData( const char* data, size_t dataSize, FilterFile& fileData )


General comment to include class member function comments for input/output parameters and functionality.
https://didactic-adventure-egnoryz.pages.github.io/cpp_coding_guidelines.html#commenting-class-member-functions

ccp-chargeback · 2026-03-06T15:56:52Z

tools/src/FilterFileReader.cpp

+	}
+}
+
+void FilterFileReader::ParsePrefixMappings( const std::string& prefixStr, std::unordered_map<std::string, std::shared_ptr<Prefix>>& prefixes )


Already mentioned.
Order of entries in prefixmap are not the same as insert order.

Yep, will be changing

ccp-chargeback · 2026-03-06T16:11:50Z

tools/src/FilterFileReader.cpp

+
+		ParseIncludeExcludeRules( filter, filterSection->includeRules, filterSection->excludeRules );
+
+		// Respaths is required


I think you're missing reading the optional "resfile" attribute from the .ini file.
The "resfile" attribute behaves just like the "respaths" attribute.
The only difference being that each [NamedSection] can only have a single "resfile" and "respaths" attributes.
But "respaths" can be multi-line, whereas the optional "resfile" attribute is a single-line entry only.

NOTE, "resfile" is intended to define a single file, which can also be one of the multi-line entries of a "respath".

Nope, I retired it :D

ccp-chargeback · 2026-03-06T16:16:57Z

tools/src/FilterFileReader.cpp

+void FilterFileReader::ParseIncludeExcludeRules( const std::string& rulesStr, std::set<std::string>& includeRules, std::set<std::string>& excludeRules )
+{
+
+	std::string s = rulesStr;


The "rulesStr" variable could just be passed into this function by value and then you could just use it directly, instead of doing the extra:
std::string s = rulesStr;

Honestly, I have not really looked at this code too closely; it's pretty much just hooking up the code from your PR. I'll do a pass on it before I take this PR out of draft.

ccp-chargeback · 2026-03-06T16:25:01Z

tools/src/FilterFileReader.cpp

+		if( pathPart.find( "../" ) != std::string::npos )
+		{
+			// Escaping is not supported in respaths
+			throw std::invalid_argument( "Escaping paths not supported for respaths: " + rawPathLine );


Lightbulb moment.
Yeah, you're right.
That makes it so much simpler.

It is never done in our files and also isn't a good pattern as it should be done in the prefixes anyway

ccp-chargeback · 2026-03-06T16:26:38Z

tools/src/ResourceFilter.cpp

+
+#include "ResourceFilter.h"
+
+#include <regex>


Alphabetical order of includes

ccp-chargeback · 2026-03-06T16:29:24Z

tools/src/ResourceFilter.cpp

+namespace ResourceTools
+{
+
+ResourceFilter::ResourceFilter()


The constructor and destructor could be replaced with this definition in the header file:

ResourceFilter() = default;
~ResourceFilter() = default;

ccp-chargeback · 2026-03-06T16:30:16Z

tools/src/FilterFileReader.cpp

+namespace ResourceTools
+{
+
+FilterFileReader::FilterFileReader()


Can be replaced with this in the header file:

FilterFileReader() = default;
~FilterFileReader() = default;

ccp-chargeback · 2026-03-06T16:32:53Z

tools/src/ResourceFilter.cpp

+	m_paths.clear();
+
+	// Populate prefix paths
+	for( auto& prefix : fileData.prefixes )


Already mentioned:
fileData.prefixes is not ordered in the insertion order.

ccp-chargeback · 2026-03-06T16:36:35Z

tools/src/ResourceFilter.cpp

+	}
+
+	// Populate search paths from filter data
+	for( auto& filterSection : fileData.filterSections )


Also note (but I think you've said it doesn't matter), but adding it as a comment to the review, just in case:
fileData.filterSections gets populated in the order returned from
"reader.Sections()", which means it is returned in alphabetical order, not actual order as defined in the .ini file.

Doesn't matter for our application

ccp-chargeback · 2026-03-06T16:42:35Z

tools/src/ResourceFilter.cpp

+				std::unique_ptr<FilterPath> filterPath = std::make_unique<FilterPath>();
+
+				// Normalise path and convert to pattern
+				std::string prefixPathStr = prefixPath.string();


Have you considered using prefixPath.lexically_normal().generic_string()?

It "should" take care of all the . \ and / checks you're manually doing in the lines below.
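
A minimal sketch of that suggestion is shown below; `NormalisePath` is a hypothetical helper name. One caveat worth keeping in mind: on POSIX systems a backslash is an ordinary filename character and not a separator, so if filter files can contain Windows-style paths, the manual `\` handling may still be needed on those platforms.

```cpp
#include <filesystem>
#include <string>

// Collapses "." and ".." elements lexically (without touching the disk)
// and returns the result with '/' as the directory separator.
std::string NormalisePath( const std::filesystem::path& path )
{
	return path.lexically_normal().generic_string();
}
```

For example, `NormalisePath( "res/./Graphics/../Graphics/Character" )` yields `res/Graphics/Character`.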

ccp-chargeback · 2026-03-06T16:45:24Z

tools/src/ResourceFilter.cpp

+	return true;
+}
+
+void ResourceFilter::ConvertResPathToPattern( const std::string& resPath, std::string& pattern ) const


This would be nice:
https://didactic-adventure-egnoryz.pages.github.io/cpp_coding_guidelines.html#commenting-class-member-functions

ccp-chargeback · 2026-03-06T16:46:27Z

tools/src/ResourceFilter.cpp

+
+void ResourceFilter::ConvertResPathToPattern( const std::string& resPath, std::string& pattern ) const
+{
+	std::string resPathString = resPath;


If resPath is passed by value, this is not needed.

ccp-chargeback · 2026-03-06T16:48:02Z

tools/src/ResourceFilter.cpp

+	return true;
+}
+
+void ResourceFilter::ConvertResPathToPattern( const std::string& resPath, std::string& pattern ) const


Why not just return "pattern", instead of it being a reference variable?
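
Both suggestions for this function (take the input by value, return the result) could look roughly like the sketch below. The translation itself is an assumption made purely for illustration: it turns `*` into `.*`, escapes other regex metacharacters and unifies separators, which is not necessarily what the PR's conversion does. `ResPathToPattern` is a hypothetical name.

```cpp
#include <algorithm>
#include <string>

// Takes the path by value so it can be modified in place,
// and returns the pattern instead of writing to an out parameter.
std::string ResPathToPattern( std::string resPath )
{
	// Work directly on the by-value copy: unify separators.
	std::replace( resPath.begin(), resPath.end(), '\\', '/' );

	static const std::string metaCharacters = "\\^$.|?+()[]{}";
	std::string pattern;
	pattern.reserve( resPath.size() * 2 );
	for( const char c : resPath )
	{
		if( c == '*' )
		{
			// Illustrative wildcard translation only.
			pattern += ".*";
			continue;
		}
		if( metaCharacters.find( c ) != std::string::npos )
		{
			pattern += '\\';
		}
		pattern += c;
	}
	return pattern;
}
```

Returning by value also lets callers write `const std::string pattern = ResPathToPattern( path );`, which keeps the result immutable at the call site.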

ccp-chargeback · 2026-03-06T16:52:30Z

tools/src/ResourceFilter.cpp

+	return CheckPath( path, sectionId, matchPath );
+}
+
+bool ResourceFilter::CheckPath( const std::filesystem::path& path, std::string& matchSectionId, std::string& matchPath ) const


https://didactic-adventure-egnoryz.pages.github.io/cpp_coding_guidelines.html#commenting-class-member-functions

ccp-chargeback · 2026-03-06T16:54:16Z

tools/src/ResourceFilter.cpp

+{
+	for( auto& filterPath : m_paths )
+	{
+		std::string resolvedPathStr = path.string();


Possible simplification.
Have you considered using path.lexically_normal().generic_string()?
It should sort out the extra checks being done below.

ccp-chargeback · 2026-03-06T17:00:21Z

tools/src/ResourceFilter.cpp

+			}
+		}
+
+		// Excludes


Shouldn't exclude rules be checked before include rules?
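
Whether the order matters depends on how the two checks are combined. If a path is accepted only when it matches at least one include rule and no exclude rule, the checks commute and the order affects speed only. The sketch below illustrates that reading. It assumes plain substring matching and that an empty include set accepts nothing; neither is confirmed by the PR, and `MatchesAny` and `IsAccepted` are hypothetical names.

```cpp
#include <algorithm>
#include <set>
#include <string>

// True if any rule occurs as a substring of the path (assumed semantics).
bool MatchesAny( const std::string& path, const std::set<std::string>& rules )
{
	return std::any_of( rules.begin(), rules.end(), [&path]( const std::string& rule ) {
		return path.find( rule ) != std::string::npos;
	} );
}

// Accepted means: some include rule matches AND no exclude rule matches.
// Swapping the two operands changes evaluation order, not the result.
bool IsAccepted( const std::string& path, const std::set<std::string>& includes, const std::set<std::string>& excludes )
{
	return MatchesAny( path, includes ) && !MatchesAny( path, excludes );
}
```

If the real logic instead lets a later include override an earlier exclude, the order would matter and a test for that case would be worth adding.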

CCPCookies added 3 commits March 4, 2026 15:15

Remove semicolon that was causing build failure on macOS

e90228b

Remove redundant whitespace from cli argument in test

d429126

CCPCookies requested a review from ccp-chargeback March 5, 2026 10:03
