Hi Andrew,
While I understand the frustration of HttpClient behaving differently than a browser, there are a few technical red flags in your implementation that likely explain why the server is dropping your connections.
The .Result Anti-Pattern
Using Client.GetStringAsync(Url).Result is a major anti-pattern in .NET. Blocking on the task this way holds the calling thread until the request completes, so you risk:
Thread Pool Starvation: Especially during "batches," you can exhaust the threads available to your app.
Deadlocks: If the code runs under a synchronization context (WinForms, WPF, or classic ASP.NET), blocking on .Result is a common source of hangs.
Exception Masking: You mentioned the AggregateException making error handling cumbersome; that is a direct result of using .Result. With Await, the original exception (typically an HttpRequestException) is rethrown directly instead of being wrapped, which makes your error handling much cleaner.
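To make the difference concrete, here is a minimal sketch of the awaited version. I am assuming VB.NET based on your snippet; FetchPageAsync and the example.com URL are placeholders of mine, not names from your code.

```vbnet
Imports System
Imports System.Net.Http
Imports System.Threading.Tasks

Module AwaitExample
    ' One shared HttpClient for the whole app, reused across requests.
    Private ReadOnly Client As New HttpClient()

    ' Hypothetical helper: returns the page body, or Nothing if the request fails.
    Async Function FetchPageAsync(Url As String) As Task(Of String)
        Try
            ' Await releases the calling thread while the request is in flight.
            Return Await Client.GetStringAsync(Url)
        Catch ex As HttpRequestException
            ' Caught directly: there is no AggregateException wrapper to unpack.
            Console.WriteLine($"Request failed for {Url}: {ex.Message}")
            Return Nothing
        End Try
    End Function

    Sub Main()
        ' VB has no Async Main, so a console entry point has to block once here.
        ' In a UI app, call FetchPageAsync with Await from an Async event handler instead.
        Dim html = FetchPageAsync("https://example.com/").GetAwaiter().GetResult()
        Console.WriteLine(If(html Is Nothing, "no content", $"{html.Length} characters"))
    End Sub
End Module
```

The Catch block sees the HttpRequestException itself, so any socket-level detail is available through ex.InnerException without first digging through an AggregateException.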
Rate Limiting & Fingerprinting (The "10054" Error)
The fact that failures happen in batches, and on only one site, is consistent with Rate Limiting or Anti-Bot protection (like Cloudflare).
Socket Error 10054 (WSAECONNRESET, "connection forcibly closed by the remote host") means the remote side reset the TCP connection. That is how a Web Application Firewall (WAF) often behaves when it detects a "scraping" pattern. It doesn't send a polite 429 status; it kills the TCP socket to save resources.
The Fix: I’ve dealt with this recently when Cloudflare updated their bot detection. HttpClient sends no User-Agent header by default, which is an easy signal for a filter to key on. Simply adding User-Agent and Accept-Language headers to your HttpClient often allows the request to pass the initial bot "sniff test."
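As a rough sketch of what that looks like (again assuming VB.NET): the CreateClient helper is hypothetical, and the header values are illustrative placeholders rather than values known to satisfy any particular site.

```vbnet
Imports System
Imports System.Net.Http

Module BrowserLikeClient
    ' Hypothetical factory: builds an HttpClient that sends browser-style headers
    ' with every request. The header values below are placeholders.
    Function CreateClient() As HttpClient
        Dim client As New HttpClient()
        ' TryAddWithoutValidation stores the string as-is, without header parsing.
        client.DefaultRequestHeaders.TryAddWithoutValidation(
            "User-Agent",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
        client.DefaultRequestHeaders.TryAddWithoutValidation(
            "Accept-Language", "en-US,en;q=0.9")
        Return client
    End Function

    Sub Main()
        ' Print the default headers so you can confirm what will be sent.
        Dim client = CreateClient()
        Console.WriteLine(client.DefaultRequestHeaders.ToString())
    End Sub
End Module
```

Setting the headers once on DefaultRequestHeaders means every call made through that shared instance carries them, so the rest of your code does not need to change.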
Apples to Oranges Comparison
Comparing a browser to a looping HttpClient is a bit lopsided. A browser typically fetches a page and its assets, then stops. Your app is likely hitting the server in a tight loop for different tickers, at a rate no human could match. Even if you held down Ctrl+F5 in a browser, it would still send a full set of browser headers and reuse its TLS sessions and pooled connections, so its traffic looks different to the server than a bare HttpClient loop.
Suggested Path Forward:
Refactor to Async/Await: Mark the calling method Async and switch to Await Client.GetStringAsync(Url) to fix the threading and exception handling.
Mimic a Browser: Add a standard User-Agent header to your HttpClient instance.
Introduce Throttling: Add a small delay (e.g., 1 second) between calls. If the 10054 errors stop, you’ve confirmed it’s a rate-limiting issue.
Check for APIs: It might be worth checking if the site offers a legitimate service endpoint, which is usually more stable than scraping HTML.
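The throttling step could be sketched like this, assuming VB.NET. FetchAllAsync, its parameters, and the dictionary return type are my own illustrative choices, not part of your code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Net.Http
Imports System.Threading.Tasks

Module ThrottledFetch
    ' One shared HttpClient, reused for every request.
    Private ReadOnly Client As New HttpClient()

    ' Hypothetical helper: fetches each URL in turn, pausing between requests.
    ' URLs that fail are logged and left out of the result.
    Async Function FetchAllAsync(urls As IEnumerable(Of String),
                                 pause As TimeSpan) As Task(Of Dictionary(Of String, String))
        Dim pages As New Dictionary(Of String, String)()
        For Each url In urls
            Try
                pages(url) = Await Client.GetStringAsync(url)
            Catch ex As HttpRequestException
                Console.WriteLine($"{url} failed: {ex.Message}")
            End Try
            ' Task.Delay waits without blocking a thread, unlike Thread.Sleep.
            Await Task.Delay(pause)
        Next
        Return pages
    End Function
End Module
```

From an Async method you would call it as Await FetchAllAsync(tickerUrls, TimeSpan.FromSeconds(1)), where tickerUrls stands in for your own list. If the 10054 errors disappear at one second and come back as you shorten the pause, that points strongly at rate limiting.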