Commit Graph
35 Commits
Author SHA1 Message Date
qwsdcvghyu89 cf75d4a5d5 feat: add deferred response buffering, TableDataProvider, and stealth improvements
- ApiResponse: add readToBuffer option to defer/stream body instead of eagerly buffering
- TableDataProvider: implement HTML table parser with per-column provider support
- StealthConfig: add 10s page load timeout and copyCookiesFrom parameter for cookie sharing
- StealthUnitDownloader: catch WebDriverTimeoutException on navigation, log warning instead of throwing
- Bump version to 2.9.0
2026-04-03 11:51:12 +11:00
qwsdcvghyu89 b16d17631e Add IDE configs, update Beam version, and enhance RelationalDataProvider
Added JetBrains Rider IDE configuration files and a backup for Beam.Api.csproj. Updated aeqw89.Beam project version to 2.7.0 and package references, including Selenium.WebDriver and System.IO.Hashing. Enhanced RelationalDataProvider to support NextSibling and PreviousSibling relations and configurable traversal distance.
2025-11-23 01:47:53 +11:00
qwsdcvghyu89 580ceb8c3c Add FollowRedirects option to downloader
Introduces a FollowRedirects property to UnitDownloaderOptions and its builder, allowing control over HTTP redirect behavior. Updates UnitDownloader to use this option, following redirects when enabled and reporting progress accordingly.
2025-11-16 01:11:22 +11:00
qwsdcvghyu89 6f37d217db Add Addon record and support for utility addons
Introduces the Addon record to represent browser addons and updates StealthConfig to support loading multiple utility addons per browser. The Firefox driver now installs specified addons from the UtilityAddons array, improving extensibility for browser automation.
2025-11-16 00:37:17 +11:00
qwsdcvghyu89 a20d48ef02 Add uBlock extension support for Firefox driver
Upgrades Selenium.WebDriver to 4.38.0 and adds logic to automatically install the uBlock extension for FirefoxDriver instances. The uBlock extension file is now included in the project and set to copy to output. Warnings are logged if the extension fails to load.
2025-11-16 00:26:56 +11:00
qwsdcvghyu89 f52aa6123b Refactor downloaders to use ByteDocument and add options builders
Replaces generic RawType with ByteDocument in downloaders and context classes, simplifying type usage. Adds builder classes for FailurePredicateOptions, FragmentOptions, SkipPredicateOptions, and UnitDownloaderOptions to improve configuration flexibility. Introduces DownloadTarget enum and SkipPredicate delegate for more granular download control. Refactors Fluent API interfaces and implementations to remove RawType generics and streamline usage. Adds Playwright and Stealth download strategies for extensibility.
2025-11-15 22:51:46 +11:00
qwsdcvghyu89 647b2b0f37 feat: introduce new composable data providers and increment version
- Added `AnchorDataProvider`, `AnchorCollectionDataProvider`, `ContentsDataProvider`, `ContentsArrayDataProvider`, `DropDownDataProvider`, `ListContentDataProvider`, and `ParagraphedContentDataProvider` for enhanced data extraction flexibility.
- Updated project version to 2.5.0.
2025-11-15 20:51:18 +11:00
qwsdcvghyu89 b5faf58b1a feat: add support for remote WebDriver and improve StealthConfig browser logic
- Added `RemoteAddress` property to `StealthConfig` for remote WebDriver support.
- Refactored browser driver creation logic with `DriverDefinition` for enhanced consistency.
- Improved error handling in browser fallback mechanism.
- Incremented project version to 2.4.6.
2025-11-14 04:36:03 +11:00
qwsdcvghyu89 76cf78006b fix: add missing break in StealthConfig browser driver fallback logic 2025-11-14 04:08:34 +11:00
qwsdcvghyu89 18c5ad83da Refactor data providers and update abstractions
- Removed obsolete data providers: `AnchorCollectionDataProvider`, `ContentsDataProvider`, and others, consolidating logic into new composable providers.
- Added `ComposeDataProviders`, `SelectDataProvider`, and `RelationalDataProvider` for improved flexibility and reusability.
- Introduced `IManySelectionComposableDataProvider` interface to support multiple-node selection.
- Enhanced `UnitDownloader` with more robust progress tracking.
- Updated package references and project dependencies for consistency.
- Improved error handling in `StealthConfig` initialization for better fallback on browser drivers.
- Incremented project version to 2.4.5.
2025-11-14 03:41:13 +11:00
qwsdcvghyu89 2958a26e4f Refactor downloaders to use generic options and unify logic
Replaces specialized binary and HTML downloaders with a generic, options-driven UnitDownloader and UnitFragmentDownloader pattern. Introduces UnitDownloaderOptions and builder classes for flexible configuration, updates interfaces and method signatures to support progress reporting, and removes redundant binary-specific classes. Updates Playwright and Stealth downloaders to use the new generic base, and adds improved error handling and reporting. Also updates dependency versions and makes minor API consistency improvements across the Fluent and Models layers.
2025-09-29 21:27:56 +10:00
qwsdcvghyu89 8e60109f5e Add required modifiers and generalize behaviour type
Marked UrlLocation properties as required in ResourceDefinition for improved null safety. Changed OrderedLinkGenerator to use the more general IStateChangeBehaviour instead of NumberedStateChanger, increasing flexibility.
2025-09-27 15:48:14 +10:00
qwsdcvghyu89 94b6c0645c Refactor fluent download pipelines 2025-09-27 15:38:58 +10:00
qwsdcvghyu89 13c6fbaf5f save 2025-09-27 13:37:40 +10:00
qwsdcvghyu89 db9bdecea6 Overall: fixed the design of IState.cs and IReadOnlyState.cs, and fixed namespaces in Beam.Abstractions to remove all references to Beam.Abstract. 2025-09-26 14:21:38 +10:00
qwsdcvghyu89 67c6a46b09 chore: update package versions and package references
- Bumped Microsoft.Extensions.Logging packages to version 9.0.9 across all projects.
- Updated aeqw89.Beam project version to 2.1.4.
- Added new transitive package references, including Microsoft.Recognizers.Text.Number, Microsoft.Playwright, EntityFramework, and others.
- Commented out or removed Beam.Temporary.Cli references.
- Enhanced package structure by rearranging content includes and cleaning up redundant package references.
2025-09-24 15:14:30 +10:00
qwsdcvghyu89 7ed05abdb8 refactor: modularize Beam into new projects and interfaces
- Introduced modularity by splitting Beam into new projects: Beam.Abstractions, Beam.Models, and Beam.Downloaders.
- Refactored existing classes into appropriate namespaces and projects.
- Replaced specific implementations with abstractions (e.g., SourceLinkBuilder to LinkBuilder, State to IState).
- Updated interfaces: added ITemplate, IArticleData, IDownloadReport, and others for improved extensibility.
- Removed deprecated classes like SourceLinkBuilder and StateChangerFactory.
- Enhanced link handling in downloaders by refactoring to use `string` over `SourceLink`.
- Consolidated shared logic under Beam.Abstractions.
2025-09-22 01:51:46 +10:00
qwsdcvghyu89 a7d148a96f Introduce Beam.Fluent and Beam.Models projects
Added new Beam.Fluent and Beam.Models projects with staged download builder and data context models. Refactored and moved model classes from Beam.Temporary.Cli to Beam.Models. Added new data providers and extended DataBindings in Beam.Dynamic. Renamed Beam.Puppeteer to Beam.Playwright and updated related classes. Updated project references and package versions. Removed obsolete and unused files from Beam.Temporary.Cli.
2025-09-18 18:32:25 +10:00
qwsdcvghyu89 849bdcd089 refactor: unify binding & data provider interfaces
- Removed BindingType enum and all related logic from Binding.
- Made Binding implement new IBinding and IKeyed interfaces.
- Moved node selection logic to IBinding.Select; removed Resolve* methods from Binding.
- Added new IBinding interface for XPath/CssPath selection.
- Refactored IDataProvider to generic IDataProvider<T>; removed GetNode.
- Updated ListContentDataProvider and ParagraphedContentDataProvider to use IBinding.
- Added new ContentsDataProvider, ContentsArrayDataProvider, and DropDownDataProvider for flexible data extraction.
- Updated DataBindings to use IDataProvider<T> properties instead of Binding.
- Updated all usages to new interfaces and patterns.
2025-06-30 23:31:39 +03:00
qwsdcvghyu89 87360d75ab refactor: update DownloadEnumerable to use IAsyncEnumerable
The DownloadEnumerable class has been refactored to accept
an IAsyncEnumerable<Ordered<T>> instead of an IAsyncEnumerator<Ordered<T>>.
This change simplifies the class and improves its usability.

This allows for better integration with asynchronous streaming
of data, enhancing performance and flexibility.
2025-06-26 14:52:00 +03:00
qwsdcvghyu89 3569ee0e87 feat: add Empty method and fix link handling
Added a static `Empty` method to `DownloadEnumerable` for
creating an empty instance. Updated link handling in
`SequentialDownloader` to use `AbsoluteUri` instead of
`ToString()`, ensuring correct link representation.

These changes improve usability and ensure consistency in
link formatting.
2025-06-26 14:51:11 +03:00
qwsdcvghyu89 518b600d07 chore: commented out WIP code 2025-06-25 22:12:19 +03:00
qwsdcvghyu89 fb76945a9a Merge branch 'master' of https://github.com/qwsdcvghyu89/Beam 2025-06-25 22:10:33 +03:00
qwsdcvghyu89 487fdcc77b feat: add PuppetConfig and integrate with CLI

Introduced a new PuppetConfig class in the Puppeteer
namespace to manage Puppeteer configurations. Updated
the CLI project to reference the Puppeteer project and
added a new method in DownloadBuilder for using a
Puppet manipulator.

This change enables better configuration management
for Puppeteer within the CLI.
2025-06-25 22:09:59 +03:00
qwsdcvghyu89 f96844063b Create README.md 2025-06-25 15:16:03 +03:00
qwsdcvghyu89 29149d9d62 Create LICENSE 2025-06-25 15:15:10 +03:00
qwsdcvghyu89 a5cc48a0d3 chore: update version to 1.3.0
Bumped the project version from 1.2.10 to 1.3.0 in the
project file. This change reflects new features and
improvements made in the library.
2025-06-25 13:47:18 +03:00
qwsdcvghyu89 3baa31a7cc feat: add Puppeteer integration for web downloads
This introduces a new Puppeteer-based mechanism for downloading
web content. It provides a flexible way to manipulate pages
during downloads, enhancing the ability to handle dynamic
content and improve the overall download process.
2025-06-25 13:42:24 +03:00
qwsdcvghyu89 2317db9d3f feat: update transformers to use ByteDocument type
Refactor the transformers in the downloader classes to use
ByteDocument instead of byte arrays. This change improves type
safety and clarity in handling document content during
downloads, ensuring that the transformations are more
consistent and maintainable.
2025-06-24 23:45:07 +03:00
qwsdcvghyu89 056e426572 Enhance async capabilities and refactor project structure
Updated project files for `Beam.Dynamic`, `Beam.Exports`, `Beam.Puppeteer`, `Beam.Temporary.Cli`, and `Beam` to include new XML headers, reorganize property groups, and add project references.

Modified `PuppetedUnitDownloader` to support additional parameters for async transformers. Changed return types in `CommonTransformers` to `AsyncTransformer` for asynchronous processing.

Significant refactoring in `DownloadBuilder`, `DownloadContext`, and `DownloadContextBuilder` to introduce generic parameters and improve context management. Updated `SequentialDownloader`, `SequentialFragmentDownloader`, and `UnitDownloader` to accommodate new async transformer types.

Introduced `TypeExtensions` for unique type name generation and added `UnitFragmentDownloaderBinary` for handling binary downloads. Updated solution file to include the new `aeqw89.Beam` project, ensuring proper references across the solution.

These changes enhance the asynchronous capabilities of the Beam library, improve type safety, and streamline the downloading process.
2025-06-23 20:30:09 +03:00
qwsdcvghyu89 482a46b568 Enhance project metadata and refactor core classes
Updated project files for `Beam.Dynamic`, `Beam.Exports`, `Beam.Temporary.Cli`, and `Beam` to include additional metadata and specific package versions. Refactored `DataBindings` and `ResolvedBindings` to records, added a new `Text` property in `Binding.cs`, and introduced `ParseNumbers` in `OnlineCleaner`. New classes `PuppetContext` and `PuppetUnitDownloader` added for Playwright integration. Introduced `ImmutableState` struct and `UnitDownloaderBinary` class for improved download management. Updated tests in `UnitTest1.cs` for number localization. Added `Beam.Puppeteer` project to the solution.
2025-06-23 02:11:19 +03:00
qwsdcvghyu89 a9a22ea23d Added constant state changers to represent singular/repeating states. Added a DownloadContextBuilder to support fluent building patterns. Changed RetryReporter and DownloadReporter to use RetryReport and DownloadReport structs to simplify type declarations. Made MainArchitecture obsolete by supporting fluent downloads with DownloadBuilder. Created a 'budge' OpenAI bridge for proof-of-concept translation. 2025-06-07 00:56:26 +03:00
riley a086cfa02b Introduced some unit testing. Cleaned up some classes in Beam. Overhauled source link generation. 2025-05-10 17:20:33 +03:00
riley bfdcdb1f3b Add project files. 2025-04-19 20:47:58 +03:00
riley 9e14d137ae Add .gitattributes and .gitignore. 2025-04-19 20:47:55 +03:00