Malcolm XML
Aug 8, 2009

I always knew it would end like this.

WHERE MY HAT IS AT posted:

Pretty sure the thread title should be await GetGoodPostsAsync() :colbert:

async/await introduces just as many issues as it solves imo


Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Ithaqua posted:

What issues does it introduce?


Stupid things like ConfigureAwait and having to manually spill variables (!!!!) to reduce allocations are just tedious

Also it conflates a CPS-rewriting sugar (await) with parallelism, which honestly is more confusing. Something along the lines of Haskell's async or parallel might have been cleaner

I just don't think it's at all clear how async/await works behind the scenes or how much control you have over it: the actual generated code is not even CPS, it's a funky state machine with a bunch of gotchas.

It needs some nicer combinators and a good spec of its semantics

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Gul Banana posted:

The 'easy fix' is typically to ConfigureAwait(false) so that you don't get marshalled onto a blocked context. However, it's irritating and gives up many of the actual benefits of async.

The 'right fix' is that async code should be top-down, not bottom-up. The requirement for using async APIs is an asynchronous event loop, such as WPF event-handling "async void"s, or ASP.NET actions. If you don't have one of those available, you have to return Task yourself; it can't be safely encapsulated. People get in trouble when they try to use async methods *internally* to implement a non-async thing. Even given that constraint, await is pretty fantastic.

They tried to get monads in the language with LINQ and it was OK, but I have rarely seen hand-written generators (and the difference between IEnumerator and IEnumerable is subtle, and almost always you actually want the former)

Now they've got a monad that isn't really a container, and in order to sneak it in without the ability to have do-notation they put in await, which manually performs CPS and then optimizes it out by using a state machine analogous to yield return

But manually implemented generators are extremely rare--everyone usually uses the combinators from LINQ instead and uses ICollection as the base

In this case everyone has to deal with the gotchas of implementing async generators via await, since the libraries aren't rich enough and there isn't enough power to define monad manipulation generically

And then there's SynchronizationContext, which is really goddamned subtle; even Jon loving Skeet says he doesn't really understand it, so how do you expect workaday devs to diagnose why deadlocks are happening in something they were sold as the best asynchronous pattern?
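
The classic deadlock is easy to sketch, for the record. A minimal, hypothetical UI handler (not anyone's real code):

C# code:
using System;
using System.Threading.Tasks;

class DeadlockSketch
{
    static async Task<string> LoadAsync()
    {
        await Task.Delay(1000);    // wants to resume on the captured UI context...
        return "done";
    }

    // hypothetical UI event handler
    void Button_Click(object sender, EventArgs e)
    {
        // ...but .Result blocks that very context, so the resumption never runs
        var s = LoadAsync().Result;
    }
}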

I wanna say if they dropped the async void compat layer and made async behave more like Haskell's async by using speculative parallelism, it might be better

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Gul Banana posted:

so with List and Future you have maybe 90%, 99% of the monadic power that Joe Applicationdeveloper can really use. each gets its own special case syntax. for that matter so does IO, which has "the assignment operator". Microsoft's implicit argument here is that by addressing each of these use cases as its EROI becomes high enough, they're frogmarching the world forward as fast as it will grumbly-shuffle. until someone comes up with a piece of software that doesn't need to be maintained, idk that we have proof otherwise.

well clearly not, since they're adding the chained null thing (the ?. operator), which is the Maybe/Option monad

is there anything against dumping classes and only using nullable structs?


Gul Banana posted:

so with List and Future you have maybe 90%, 99% of the monadic power that Joe Applicationdeveloper can really use. each gets its own special case syntax. for that matter so does IO, which has "the assignment operator". Microsoft's implicit argument here is that by addressing each of these use cases as its EROI becomes high enough, they're frogmarching the world forward as fast as it will grumbly-shuffle. until someone comes up with a piece of software that doesn't need to be maintained, idk that we have proof otherwise.

Try using both at once. It's a clusterfuck and you actually need monad transformers to deal with it properly.

They basically tried to hide the theoretical complexity, but in doing so opened a giant can of worms. Explicit > implicit, and unless there is a very clear way of translating from implicit to explicit (do-notation is basically just (>>=) and lambda-binding) there's gonna be problems

F# had async workflows, which are really nice and much better for a lot of async stuff than the TAP: http://tomasp.net/blog/csharp-async-gotchas.aspx/

lcw014 or w/e get in here and explain yourself

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

redleader posted:

Any chance you could elaborate on this?

code:
var x = Enumerable.Range(1, 10);

foreach (int i in x.Take(5)) { Console.WriteLine(i); }
foreach (int i in x.Take(5)) { Console.WriteLine(i); }

This prints 1..5 and 1..5 as opposed to 1..5 and then 6..10 because IEnumerable is a factory for IEnumerator, which carries the state (and needs to be pumped manually via MoveNext).

I really am not sure why they chose to make the factory the primary thing in the foreach loop and LINQ, but it means that it's a pain in the rear end to handle delimited chunking efficiently (or as far as I can tell at least)

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Volte posted:

I'm not even sure in what kind of nightmare land you'd want iterating over the same collection twice in a row to print two different results. IEnumerator is the iterator for IEnumerable. Iterators are mutable even when the enumerable isn't, which is why it makes no sense to expect any sort of consumer state to be encoded in the enumerable. foreach creates a new iterator (IEnumerator). If you want to use the same enumerator across multiple loops (i.e. for chunking), create it yourself before the loops and use the MoveNext method manually.

edit: also for LINQ it wouldn't be that hard to write a collection wrapper that does chunking for you by implementing IEnumerable.

The factory-ness of IEnumerable is fine, it's just that it doesn't make a lot of sense to iterate over a factory. I would rather you have to call GetEnumerator and then pass that enumerator to the foreach loop, to make the forcing explicit.

The point is that the iterator encodes the current state. If you want to get another view of the collection, you call GetEnumerator which makes the "Reset" explicit.
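
A minimal sketch of that explicit-enumerator style, with the chunking Volte mentions (names are illustrative):

C# code:
using System;
using System.Collections.Generic;
using System.Linq;

class ChunkDemo
{
    static void Main()
    {
        var source = Enumerable.Range(1, 10);

        // one enumerator shared across both loops keeps its position between them
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            for (int i = 0; i < 5 && e.MoveNext(); i++)
                Console.WriteLine(e.Current);   // 1..5

            for (int i = 0; i < 5 && e.MoveNext(); i++)
                Console.WriteLine(e.Current);   // 6..10
        }
    }
}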

And then you make collections immutable like they ought to be :getin:

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

ljw1004 posted:

I'm happy to talk about async design! Thanks for the comments, Malcolm and others. I love the chance to talk about language design. Looking back, I think we made the right calls at the time, and right now I'd make mostly the same calls again...


ConfigureAwait? There's a straightforward choice: either optimize the design for app authors (by returning to the original thread/context after an await), or optimize the design for library authors (by continuing after the await on whichever thread/context you happen to be). We chose to make things easier for the app authors, because there are more of them and because they're less expert.

Within the .NET framework, they use .ConfigureAwait(false) everywhere. Using the new "Roslyn Diagnostics" feature (as in Dev14 CTP1) you can add custom rules to your project, including e.g. enforcing that every await has a .ConfigureAwait() after it. Within VS, actually, they don't enforce this... the people there came up with a different pattern they think is more robust. They're still refining it. Maybe in a month or so I'll be able to post what they've done.


Manually spill variables to reduce allocations? There's a straightforward choice: either allow await in arbitrary places e.g. "using (var x = await GetFileAsync())" and deal with the consequences, or do what F# did and only allow it in two places "do!" and "let!". We chose to allow it everywhere because this was the biggest complaint in F#. We can still optimize the compiler further to make it more efficient when you use await in weird places, e.g. "myStructArray[await i].f(await j)", and indeed we've already made improvements in await codegen for Dev14 CTP1, but it's a case of diminishing returns.

Is it tedious to write awaits in ways that don't involve spilling? Not really. You can still write "var x = await <expr>" and "await <expr>" and "using (var x = await <expr>)". It really is only the more esoteric code patterns, ones that usually don't pass code review, that involve spilling.


Conflating CPS with parallelism makes it more confusing? I strongly disagree. The whole world has been on a stupid mistaken bender over the past decades, stemming from the mistaken belief that asynchrony means multiple threads. I know... I suffered through it for my undergrad, PhD and postdoc, and saw countless research careers stupidly wasted on this misconception. About half the people still don't get it, and write bad (inefficient) code as a consequence.

Look, if you're in a restaurant and need to serve two tables, do you hire TWO waiters? No. You use the same one cooperatively for the two tables. It's an easy enough concept in real life. It should be easy in code. It will be easy for people who are brought up thinking this way (including node.js developers!)


The semantics of await are precisely "CPS with a one-shot continuation", no more, no less. The "funky state machine" is as funky as it needs to be to optimize performance, no more, no less. The only gotcha I can think of is if you write a custom awaiter that's a mutable struct, but mutable structs are always dodgy.

Here are two blogs I've written with more information about "behind-the-scenes". They're on my blog, not part of official cleaned-up MSDN documentation, because honestly they're needed by almost no one except Jon Skeet to set brainteasers about.
http://blogs.msdn.com/b/lucian/archive/2012/12/12/how-to-write-a-custom-awaiter.aspx
http://blogs.msdn.com/b/lucian/archive/2013/11/12/talk-async-codegen.aspx


Combinators? Here I think you're way off the mark :) You should compare "await" to "callbacks". Awaits are COMPOSITIONAL with respect to the other operators of the language in a way that callbacks are not. The word "compositional" comes from computer science theory... what it boils down to in practice is that with callbacks you can't continue to use your familiar try/catch blocks, or while loops, or for loops, or even the goddam SEMICOLON operator. You have to figure out other weird ways to encode them. Callbacks are not compositional with respect to most of the language operators. By contrast, await is compositional with respect to them.

As for the combinators? They're all present in Task.WhenAll / Task.WhenAny / Task.Delay / ... Indeed the whole great thing about async/await is that, through the Task type, it has such a plethora of great combinators! TPL Dataflow! All of them!

As for a spec of semantics? With Neal Gafter's help (of Java fame) I wrote the VB language spec for async+await and I think I did a pretty good job. Mads wrote the C# spec for it and also did a pretty good job. Please let us know if you think it's underspecified.


What you're seeing is an underlying "gotcha" that async+await makes more manifest. In the case of a console app, I personally just do this:
code:
Module Module1
   Sub Main()
      MainAsync().GetAwaiter().GetResult()
   End Sub

   Async Function MainAsync() As Task
      ' ... write my code here
   End Function
End Module
That's easy, but doesn't have the nice "single-threaded" guarantee you normally get as an app author. If you want that, then you'll have to create your own message-loop as described here:
http://blogs.msdn.com/b/pfxteam/archive/2012/04/13/10293638.aspx

We discussed whether or not to make async console apps easier to write, by providing that message-loop by default, but there's no good consensus on how exactly it should behave, so we didn't.


Libraries should not generally go around offering both sync and async versions of their APIs, and indeed most don't...
http://channel9.msdn.com/Series/Three-Essential-Tips-for-Async/Async-Library-Methods-Shouldn-t-Lie


Exposing monads to programmers never makes their lives easier! No matter how many times Erik Meijer says it! Same goes for the rest of category theory! (I spent many hours struggling through Benjamin Pierce's book "Category Theory Made Simple So That Even Dumb Computer Scientists Can Understand It", and attended category theory seminars at college, and roomed with a category theory PhD friend, and I still don't buy it...)


It's really not hard to avoid deadlocks. Just stop using .Wait() and .Result. Heck, write a Dev14 CTP1 analyzer plugin to enforce this if you want!

Deadlocks aren't a significant problem in the wild. In my opinion the more significant problems are (1) re-entrancy bugs, (2) people failing to understand the difference between CPU- and IO-bound code.


Speculative parallelism? A huge dead end, based on the misconception that having lots of your codebase be multicore-able is somehow worthwhile. Turns out it's not. We also pursued other similar approaches, e.g. doing memory heap shape analysis to discover which methods can be executed in parallel. Wasted a good dev for a whole year on it.

In the end, you actually get the bulk of your parallelization benefit from just the small computational inner-loops in your code, the ones that iterate over large datasets. And these are best expressed in domain-specific ways and coded by experts, e.g. PLINQ, or calling into a lock-free sorting algorithm, or similar. Trying to put multithreading (implicit or explicit or "hinted at") into the rest of your code gives negligible performance benefits, but at huge cost in terms of bugs and race conditions and mental complexity. Not worth pursuing.


We started from F# async workflows, fixed up the chief complaints with them, aggressively improved their performance to within an inch of their lives, and then made the concessions needed to bring async into the mainstream. VB/C# async is the result.

From Tomas' blog...

Gotcha #1: This is the point that C# async methods don't yield control until they hit their first not-yet-completed await, while F# async methods yield immediately. We did this very deliberately because it makes await vastly cheaper:
http://channel9.msdn.com/Series/Three-Essential-Tips-for-Async/Async-libraries-APIs-should-be-chunky

In any case, both the F# and the C# code are ugly for mixing blocking and async stuff in them. This isn't the way anyone should be writing code, and Tomas is wrong to call it out as a benefit of F#.


Gotcha #2: Out of date. The C# and VB compilers both warn about the failure to await.


Gotcha #3: The fact that we allow async void. Tomas trumpets it as an achievement that you have to write 8 extra lines of pointless boilerplate to use async in real-world code. I don't think that's an advantage. The right solution is (as we did) to allow async void as a painful concession to back-compat. On a project-by-project basis you might want to opt in to more aggressive compile-time warnings about it; that's what Dev14 CTP1 diagnostics are all about.


Gotcha #4 and #5: The problem that in C# an async lambda might be a void-returning async, if it's passed to a parameter of type Action. Yes this is a shame, but it's an unavoidable consequence of earlier design decisions in C#. And in any case, like he shows with Task.Run instead of TaskFactory.StartNew, the solution is always simply to use the more modern APIs whose overloads are designed to be async-friendly. (alas this doesn't work with WinRT which isn't technically able to have such async-friendly overloads for irritating reasons).


Gotcha #6: The fact that we used the Task type, which has blocking Wait method on it. Sure. But the advantages were huge of buying into the whole Task ecosystem, and its many powerful combinators, and these benefits were TOTALLY worth it.

My main issue is that this should all be automated, by default. I don't have the bandwidth to keep up w/ the latest in async/await (this is like the 3rd time we tried this, right?) AND also update all my code.

Like it's a huge investment to asyncify existing code.

As far as "write your own Roslyn diagnostic" goes, that's basically the equivalent of "write your own distributed mapreduce function in Erlang": a glib gently caress-you, not that you intended it as such. Best practices must be embedded within the tools y'all provide! It's why ReSharper annoys the poo poo out of me--I pay $14k for VS Ultimate and IntelliSense can't compete w/ a plugin done by some dudes who have no access to the compiler or platform.

Ok now to some specific stuff:

1) the mystruct[await foo].bar(await baz) issue is basically the thing applicative functors solve, and with the joinads extension it's very naturally done (as with idiom brackets);
having do! and let! and use! is pretty OK otherwise though (fwiw, use/use! is significantly nicer than the using statement in C# since it doesn't force you to nest, afaik)
2) Category theory is just a framework for naming stuff. I have a degree in math and I do not give a poo poo about higher categories, but I do care about 3-4 of the patterns: Functor (can you run Select in a context?), Applicative (can you lift values and tuple them inside a context?), and Monad (can you thread results in a context?), plus Traversable (not category theory!), which gives you extremely nice internal iterators

F# got it right in this context w/ workflows.

I certainly do not care for exposing natural transformations and commutative diagrams, but the CLR is simply unable to express even the Functor pattern (see the sketch after this list).

3) The speculative parallelism ends up being nice because although one would never hire a waitress per table, it's really really nice to work as if you had one waitress per table! Cooperative multitasking sucks to work through. The really slick part of the async package is that it allows neat ways of creating sparks (lazy futures) that are multiplexed onto Haskell green threads, which are multiplexed onto OS threads, which are multiplexed onto cores by the kernel as needed. This means that you can spawn a bazillion sparks and not really care about running out of memory.


4) Maybe Roslyn will solve some of these problems, but that's like a year away, and even though the release cadence has stepped up to quarterly I'm surprised that a lot of these diagnostics/analyses were not shipped when async/await shipped

If Simon Marlow was not heavily involved w/ async/await I will be very disappointed; that guy figured out a very natural pattern to handle multicore scaling + IO multiplexing, and FB poached him.
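
To sketch the point in (2): there's no way on the CLR to write one interface over an arbitrary container shape F<T> (no higher-kinded types), so every container re-declares the same Select by hand. A minimal, illustrative example:

C# code:
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class Functors
{
    // Select for IEnumerable<T>...
    public static IEnumerable<TResult> Select<T, TResult>(
        IEnumerable<T> source, Func<T, TResult> f)
    {
        foreach (var item in source) yield return f(item);
    }

    // ...and the same shape again for Task<T>; no common interface can
    // abstract over "any F<T> with a Select", so this repeats per container
    public static async Task<TResult> Select<T, TResult>(
        Task<T> source, Func<T, TResult> f)
    {
        return f(await source);
    }
}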

Malcolm XML
Aug 8, 2009

I always knew it would end like this.
code:
  'About 2000 lines of if/then, case statements, value lookups, and translations, most of which look like:
  If drThinger("value")="other value" then
   sbThinger.AppendLine("<Thinger>whatever value=othervalue means</Thinger>")
  End If
   'Repeat 80,000 times


what the gently caress

2000 lines/ 80k times? Refactor that poo poo, stat.

Unless you're being hyperbolic but even then goddamn

Also use XmlWriter, or schematize it and reflect it into a class via xsd.exe. But there be dragons.
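
A hedged sketch of the XmlWriter route (dtThinger and the element names are borrowed from the snippet above as placeholders); it streams straight to the file and handles escaping and well-formedness for you:

C# code:
using System.Data;
using System.Xml;

using (var writer = XmlWriter.Create(@"c:\feeds\outbox\things.xml"))
{
    writer.WriteStartDocument();
    writer.WriteStartElement("Thingers");

    foreach (DataRow drThinger in dtThinger.Rows)
    {
        if ((string)drThinger["value"] == "other value")
            writer.WriteElementString("Thinger", "whatever value=othervalue means");
    }

    writer.WriteEndElement();   // </Thingers>
    writer.WriteEndDocument();
}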

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Scaramouche posted:

Thanks guys, that's basically what I did:
code:
Protected Sub MakeXML2(dtThinger As DataTable)

Dim strFileName As String = "c:\feeds\outbox\" & Left(System.Guid.NewGuid.ToString, 8) & "_POST_PRODUCT_DATA_" & ".xml"
Dim fsX As New FileStream(strFileName, FileMode.Create)

Using sbThinger As New StreamWriter(fsX)
 sbThinger.WriteLine("<?xml version=""1.0"" ?>") ' header etc.
 For Each drThinger As DataRow In dtThinger.Rows
  If drThinger("value") = "other value" Then
   sbThinger.WriteLine("<Thinger>whatever value=othervalue means</Thinger>")
  End If
 Next
 sbThinger.WriteLine("stuff") ' footer etc.
End Using

PostXMLDocument2(strFileName)

End Sub

Protected Sub PostXMLDocument2(strFileName As String)
 'Don't load/save xml file from string anymore, go straight to existing file

 Dim buffer As Byte() = File.ReadAllBytes(strFileName)
 'connect, authenticate, upload, etc.
End Sub
So far, works fine.
Pros:
- Entire string is never in memory at once while building it
- Not passing string between subs any more; the only time it's fully loaded is when I'm getting it from drive
- Generated file size is actually about 15% smaller for some reason (I think from whitespace introduced by LoadXML/SaveXML)
- Faster; for the 94 MB file it actually takes longer to upload than to create now

Cons:
- String isn't loaded into XML doc anymore for quick/dirty validation check
- Drive thrashing? I have no idea if there'll be eventual performance concerns of going to drive iteratively instead of all at once


Those 2000 lines don't 'live' in the MakeXML sub, about 90% are in external abstracted functions (e.g. GetColor, GetLength, GetBrand, that kind of thing). I'm parsing data from about 30 different suppliers, and need to convert their values into numeric codes. E.g. "Red" = 87550701, with the resulting XML being "<Color>87550701</Color>". Except I have to parse "Red","Cherry","Blood","Redd",etc.

I realize the 'real' solution is make a rockin hardcore XSD translation but man, this data is so all over the place it's currently 'easier' to add cases to GetColor.

Thanks again guys! This is why this forum rocks, as usual.

why are you writing to a file on disk first (do you need to keep it around?). you could just stream it all the way out to the connection (if you must keep the file around and read from it, at least stream it out via File.OpenRead)
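
A minimal sketch of streaming the existing file out (the request object is a stand-in for whatever PostXMLDocument2 actually uses):

C# code:
using System.IO;
using System.Net;

var request = WebRequest.CreateHttp("https://example.com/upload");  // placeholder URL
request.Method = "POST";

using (Stream fileStream = File.OpenRead(strFileName))        // strFileName as in the quoted code
using (Stream requestStream = request.GetRequestStream())
{
    fileStream.CopyTo(requestStream);  // copies in small chunks; the 94 MB never sits in memory
}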

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Geisladisk posted:

Does anyone have any tips for streamlining Windows Service development?

I started working on a large, mature system two months ago, which runs on a fairly baffling ecosystem of virtual machines. Most of these VMs are running windows services, which I need to develop.

I've got batch scripts that, whenever I recompile a particular service, automatically uninstall it, copy the new files to the VM, reinstall the service, and then start it. I then attach the debugger to the process running on the VM if I need to debug. This is all fairly nice, but the process still feels very unwieldy, and compiling and running the code takes up to a minute, which is frustrating me more than it should.

Are there any ways to streamline this process that I haven't noticed? I'm running Visual Studio 2012 Pro, but I can get an upgrade to 2013 Pro if needed.

Chef, Puppet, and PowerShell DSC

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

wwb posted:

This is why one should not use Knockout if one is trying to learn web stuff. Knockout is great, but the user base is pretty much recovering WPF/Silverlight dudes and some extension into .NET-only devs. It doesn't get much pickup outside of the .NET world.

that said MVVM is very good at solving a very common problem

I suspect you could use React with Knockout.

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Essential posted:

Thank you for the suggestion Lucian! Is there a similar async thing I can do with the 4.0 framework? We have to support Win XP machines so I cannot get above 4.0 framework :(

Microsoft.Bcl.Async

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Newf posted:

It's VS 2010, but I'm debugging in x86 and I'm able to edit and continue if I break the running process - it just won't let me do any typing (editing) while the process is running.

This would break source mapping if you e.g. deleted a method with a breakpoint in it.

I don't think there's a workaround; it's very annoying though :(

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Uziel posted:

Oh. Does my code have to be open source in order to take advantage of it? Its for an internal tool, not software that is sold.

If you are using ext for the View, how did you get around duplicate models?

Yeah, it's a Google Calendar-style event scheduler that is from the Ext.NET team and the primary reason we went with that over Ext JS:
http://examples1.ext.net/#/Calendar/Overview/Basic/

Talk to legal. Are you distributing it? "Distributing" software is a legally ambiguous term.

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

GrumpyDoctor posted:

This is the problem that generics solved, but the collection you're iterating over doesn't have a generic interface, which is why var's type inference isn't working.

Var's type inference is working correctly; the compiler is unable to deduce a more specific type, since that information was erased by the collection not exposing a generic interface.
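
A minimal sketch of the distinction, with ArrayList standing in for any non-generic collection:

C# code:
using System.Collections;

var rows = new ArrayList { "a", "b" };

foreach (var row in rows)        // var correctly infers object: that's all
{                                // the non-generic IEnumerable can promise
    string s = (string)row;      // the cast recovers what the collection never exposed
}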

Malcolm XML
Aug 8, 2009

I always knew it would end like this.


As someone who works in "big data", this is just standard statistics and experimental design, the kind of stuff that has been used for decades and is still useful in 99.9% of cases. If your data fit into memory on a commodity machine (i.e. 1TB or less) then it isn't big; at least that's our criterion.

As ljw said, what you want is a data-driven system where you can take a question definition and project it through a view (e.g., for the ones you showed, maybe a MultipleChoiceView that takes Question and Answers properties, with maybe a MultipleChoicePictureView that supports pictures as answers if needed, etc.)


epalm posted:

If my test project does a pretty good job of testing my services and visiting most code paths, but uses a database instead of mocking out repositories, does that make me a bad person?

For things that must be unique, like product names for example, I have to do stuff like product.Name = Util.RandomString(length: 20);

But the idea of creating like 40 mock repositories just sounds like a monumental amount of work.

Code to an interface, not an implementation, and allow a constructor overload that accepts that interface. Done. That's dependency injection.

I spent too drat long realizing that's all you ever need. Containers and service locators and poo poo like that are just sugar to make it easy to wire up dependencies.
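
A minimal sketch of that, with hypothetical names throughout:

C# code:
public class Product
{
    public string Name { get; set; }
}

public interface IProductRepository
{
    Product GetByName(string name);
}

public class ProductService
{
    private readonly IProductRepository _repo;

    // tests hand in an in-memory fake; production hands in the real repository
    public ProductService(IProductRepository repo)
    {
        _repo = repo;
    }

    public bool Exists(string name)
    {
        return _repo.GetByName(name) != null;
    }
}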

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Schmerm posted:

Say I want to host managed CLR code in my C++ application. I've got some managed DLL assemblies, and from my C++ code, I want to be able to call managed methods and poke global/static structs. How do I go about doing this? Do I have to use COM? I can't seem to find any APIs on MSDN that do these things.

I know there's C++/CX or C++/CLI or whatever it's called, and I'm not against making wrappers written in this, as long as they can be called from normal unmanaged native C++ code.

I've been playing with Mono, and it has very straightforward APIs: create app domain, load .dll, get class by name, get method by name, and then call it. I'd figure MS would have made something even more elegant, and faster. Plus, I want to be able to debug the whole combined managed/unmanaged mess from Visual Studio.

Host the CLR via COM.

It's a pain. Going the other way via P/Invoke / COM interop is much nicer, so it may be worth inverting your setup.

Or do generic IPC via message passing.

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Bognar posted:

Where is it putting the order by in the query and what does the query look like in code? I've never run into an ORDER BY statement being generated if not asked for.

This is why statement mappers are generally better than ORMs: total control, yet still convenient.
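
For instance, a minimal sketch in the statement-mapper style using Dapper (table and columns are hypothetical); you write the SQL yourself, ORDER BY included, and the library only maps rows to objects:

C# code:
using System.Data.SqlClient;
using Dapper;   // the Dapper NuGet package

public class Order
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

// elsewhere, with a connection string assumed:
using (var conn = new SqlConnection(connectionString))
{
    var orders = conn.Query<Order>(
        "SELECT Id, Total FROM Orders WHERE CustomerId = @id ORDER BY Id",
        new { id = 42 });
}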

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Mr Shiny Pants posted:

How do I use stream.CopyToAsync(stream) in F#?

I use the following, it is what I got working:

code:
[<Literal>]
let source = @"C:\Origin\"
[<Literal>]
let destination = @"C:\Destination\"

let cancellationSource = new CancellationTokenSource()
 
let CopyFile file =
    async {
        printfn "Starting copy of: %s" file
        let buffer = Array.zeroCreate 4096
        use reader = new BufferedStream(new FileStream(file, FileMode.Open, FileAccess.Read))
        use writer = new BufferedStream(new FileStream(destination + Path.GetFileName(file), FileMode.Create, FileAccess.Write))
        let finished = ref false
        while not finished.Value do
            let! count = reader.AsyncRead(buffer, 0, 4096)
            do! writer.AsyncWrite(buffer, 0, count)
            finished := count <= 0
        writer.Flush()
        printfn "Copied: %s" file
    }

let files = 
    Directory.EnumerateFiles(source)
    |> Seq.iter(fun x -> Async.Start(CopyFile x,cancellationSource.Token))    

printfn "So sleepy......"  
Thread.Sleep(1000)
printfn "Cancelling workflows"
cancellationSource.Cancel()
It works pretty awesome, but the standard streams already support cancellation so this looks redundant to me.

F# async and C# async are totally different

Malcolm XML
Aug 8, 2009

I always knew it would end like this.
Please use attribute-based routing and have your controllers use reasonable method names.
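
For example, a minimal Web API 2-style sketch (controller and routes are made up):

C# code:
using System.Web.Http;

[RoutePrefix("api/orders")]
public class OrdersController : ApiController
{
    // GET api/orders/42 - the route is declared right here,
    // not inferred from a convention somewhere else
    [HttpGet]
    [Route("{id:int}")]
    public IHttpActionResult GetOrder(int id)
    {
        return Ok(new { Id = id });
    }
}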

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

RICHUNCLEPENNYBAGS posted:

Yeah, instead, just an extra opportunity to gently caress up naming it.

as opposed to having it implicitly generated by some weird convention-based thing that is impossible to control?

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Inverness posted:

There's an update to the .NET Framework blog about the status of open source.

I'm surprised that they've only moved 25% of things to GitHub so far.

I'm curious about what exactly is involved in moving those libraries and the CLR repository to GitHub that is consuming their time.

there is a 30+ step process to open source things at Microsoft, for good reasons.

You don't want some patent/licensing agreement to bite you in the rear end a decade after it was signed.

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

ljw1004 posted:

:( I think that 350k lines of code in 37 workdays is pretty darned impressive! (assuming reasonable time off for Thanksgiving and Christmas).


I asked Immo for more details on what's taking the time. He says he touched on it a bit in this part of his recent video interview
http://channel9.msdn.com/Series/NET-Framework/Immo-Landwerth-and-David-Kean-Open-sourcing-the-NET-Framework#time=11m6s
but I'm hoping he'll blog more about the process in detail.

Yeah, for real. It's not a trivial process to even open source code within the drat company, let alone take a large component and clear it for public consumption. Kudos to everyone involved.

Did you know, for example, that "prd" is a Czech word for fart and could be offensive? It took down an internal portal for a day since it had prd in the URL and didn't pass some extended check.

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Captain Capacitor posted:

I know this isn't entirely on topic but if anyone has any gross issues with Azure I'm finally in a position to do something about it. Drop me a PM and I'll do my best to help out.

have u fixed the random reboots in the fabric that are unannounced and really weird


also do yall have a release process in place so one dude cant gently caress up multiple regions


it would have been nice to have proper diagnostics for tables


did they fix the issue where RDFE has a dirty word filter preventing previously existing names from being recreated

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Captain Capacitor posted:

I know this isn't entirely on topic but if anyone has any gross issues with Azure I'm finally in a position to do something about it. Drop me a PM and I'll do my best to help out.

Also why is cerebrata's dogshit management studio eons better than any of the portals

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

ljw1004 posted:

Just to say, I only learnt about "Azure Webjobs" last week, and I'm OVERJOYED!!!

Seriously, my number one complaint about azure used to be worker roles. They took ten minutes to deploy. It was hard to get logging out of them. And trying to write something like an algorithm for "guarantee execution at least once a day" required distributed mutexes and idempotency and stuff -- way beyond my abilities, even though I did my PhD in distributed coding.

Azure Webjobs make all that so easy. And they're free for jobs that run no more than once an hour. What a joy.

we used blobs as a ghetto distributed lock and then Orleans b/c winfab is still not ready for prime time, or something

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Captain Capacitor posted:

I know this isn't entirely on topic but if anyone has any gross issues with Azure I'm finally in a position to do something about it. Drop me a PM and I'll do my best to help out.

why is there a hard limit on storage accounts per subscription

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Captain Capacitor posted:

This is what a good chunk of my team is working on right now.


The whip has been cracked from on high on this one.


These two I'm not sure about, but I'll ask around.

cool i Pm'd u with more details on stuff since my old team either broke azure or was broken by azure on like a weekly basis

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

JawnV6 posted:

I have an application deployed that uses HttpWebRequest to send a request to a server. They're seeing an unexpected error, "Method not found: 'System.Net.HttpWebRequest". It appears to be in every .net version from 1.0 to 4.5. All of the error pages google brings up are dealing with some old Mono version missing this. How does an installation miss out on a chunk like this?

The entire code using this class:
code:
HttpWebRequest hwr = WebRequest.CreateHttp(url);
HttpWebResponse response;
try
{
    response = (HttpWebResponse)hwr.GetResponse();
    Stream receiveStream = response.GetResponseStream();
Not really sure what to do. I need to hit a particular URL with a user-supplied string, check the response and display a Pass/Fail. Is there another method to doing this that's more available?

install the HTTP client libraries from NuGet and use those: https://www.nuget.org/packages/Microsoft.Net.Http
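
A minimal sketch with HttpClient from that package (the pass/fail check is assumed from the post above):

C# code:
using System.Net.Http;

class UrlChecker
{
    static bool CheckUrl(string url)
    {
        using (var client = new HttpClient())
        {
            // blocking on .Result kept for brevity; prefer await in real code
            HttpResponseMessage response = client.GetAsync(url).Result;
            return response.IsSuccessStatusCode;   // the Pass/Fail signal
        }
    }
}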

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

JawnV6 posted:

It's unclear to me how grabbing something else in an extra complicated way is going to make that new thing available on a remote computer I have zero control over and is already missing core functionality? Am I missing some bit about how this works?

how are you deploying code to a server you have no control over? and who pissed in your cornflakes? NuGet is standard for component delivery on .NET. clearly the application is linked against HWR or you'd fail at binding time as opposed to run time, so fix your broken deployment by either sidestepping HWR or not using Mono

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Ithaqua posted:

http://blogs.msdn.com/b/visualstudioalm/archive/2015/02/12/build-futures.aspx

Blog on the VSO build stuff that's coming... I was talking about it a few days ago, I guess they're finally ready to start publicizing it a bit.

loving finally. being ordered to move to TFS sans a decent CI system was horrible

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Ithaqua posted:

Read Roy Osherove's book The Art of Unit Testing. That's first and foremost.

IMO there should be no such thing as a "test developer". Everyone should be writing unit tests while they write their code. The tests should be considered first-class citizens and live in the same solution as the code they're testing. If you're doing true unit testing, the VS test runner should be configured to run tests after every compilation -- this goes for everyone. If a test breaks, fix the test right away. It doesn't matter who wrote the test that's failing. Testing is all about shortening the feedback loop and identifying and correcting bugs faster. It also helps inform the design of your application -- something that's hard to test is probably badly designed and needs to be refactored. Siloing "developer" and "test developer" is counter to the entire idea of shortening the feedback loop and informing design.

The fastest way to have a failed automated testing adoption is to let the tests be second-class citizens that only one or two people think about or maintain. The tests stagnate, and eventually they start to fail. By the time anyone gives a poo poo and goes to look at the tests, they have an awful time figuring out whether the tests are invalid (not testing the right things) or evidence of a bug. Then everything gets thrown out, and a year later someone takes another stab at the whole "testing thing", with exactly the same results.

Doing it right requires a culture change, plain and simple.

Bingo. This is the attitude to have.

Testing is something intrinsic to developing good code. Ultimately, as a test lead, I would hope you keep everyone else in line when it comes to testing their changes

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

ljw1004 posted:

Summary: You can now write a single project and have it run on multiple devices (before you had to have a different project for each device.) It's now easier to do form-factor-adaptive UIs too.

I've been overseeing the technical side of things for .NET apps. I can't wait to say more about it all, but for that I have to wait until next Wednesday when things will be announced at Microsoft's Build conference. (We released a kind of intermediate "point-in-time" version of the tools last month, but I'd get too confused if I tried to describe that as well as the final way things work...)

Unfortunately no one actually cares about Windows Phone or tablets, so idk why they kept pushing universal apps so hard

The cross platform stuff is way more useful

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

aBagorn posted:

Ok I hope I'm wording this right, but I want to figure out how to do something, and I'm really dumb when it comes to async/await stuff in general.

Currently we are processing csv files (that could contain upwards of 5 million rows) and populating the results into objects that will eventually live in the database. The methods are pretty convoluted and involve multiple steps, which all have to be done in order (for now). Some samples of the code I've inherited below. I feel like it belongs in the coding horrors thread.

C# code:
:words:
So the process basically flows file -> list of rawDataObjects -> foreach loop to make list of rawDataRow objects -> foreach loop to transform rawDataRow to dbObjects and save them via EF in batches.

I don't have too much leeway to completely gut everything (i.e., we save those rawData objects to their own table at one point and FK the dbObjects to them) so I can't really skip any of the steps.

What I'd like to do, however, is potentially run the first foreach loop until I hit some arbitrary number of rawDataRow objects (say 50k) and immediately kick off the foreach(var rawDataRow in rawDataRowList) loop with that set while the next 50k rawDatas get transformed into rawDataRows.

This should be possible, right?

ugh

use a merge into, generate the csv row -> table row mapping client side and just fire that poo poo off in one query, it'll be one big ol req and you can even have it log dupes ("WHEN MATCHED THEN fart")


batching only makes sense if csvrow->dbobject is expensive, so that you can parallelize the work; otherwise you are IO bound anyway


this looks like it's batching something in 30k chunks already
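
A hedged sketch of the single-statement approach (staging table, columns, and names are all hypothetical): bulk-load the parsed rows into a staging table, then let one MERGE handle the dupes server-side.

C# code:
using System.Data.SqlClient;

using (var conn = new SqlConnection(connectionString))   // connection string assumed
{
    conn.Open();

    // after SqlBulkCopy has loaded the parsed CSV rows into dbo.StagingRows:
    using (var cmd = conn.CreateCommand())
    {
        cmd.CommandText = @"
            MERGE INTO dbo.Products AS target
            USING dbo.StagingRows AS source
               ON target.Sku = source.Sku
            WHEN MATCHED THEN
                UPDATE SET target.Color = source.Color   -- or log the dupe
            WHEN NOT MATCHED THEN
                INSERT (Sku, Color) VALUES (source.Sku, source.Color);";
        cmd.ExecuteNonQuery();   // one round trip for the whole file
    }
}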



aBagorn posted:

I'm looking to have the overall job to complete in a shorter time. We're running this in Azure so shorter processing times = less $$$

(I figured I didn't know what I was talking about)

As far as resources go, I cranked up the VM this service was running on to a D13 (8 cores 56GB RAM) and we're 3 days into processing a file with 5 million records, which is unacceptable to the client.

I have another strong feeling that reaching out to the DB (which lives on another server) during the foreach loops is also potentially a root cause of the problem, as well as improper management of the EF Context

jesus christ

Malcolm XML
Aug 8, 2009

I always knew it would end like this.
also you are materializing the stream into lists, so you have a shitload of rows in memory


you probs want to stream IEnumerables and manipulate using LINQ
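
A minimal sketch of the streaming shape (RawDataRow and ParseRow are stand-ins for whatever the real code does):

C# code:
using System.Collections.Generic;
using System.IO;

static IEnumerable<RawDataRow> ReadRows(string path)
{
    using (var reader = new StreamReader(path))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
            yield return ParseRow(line);   // one row in flight, never 5 million
    }
}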

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

gariig posted:

5 million records isn't much but how you do it can make a big difference. My first suggestion would be to look into SQL Server Integration Services (SSIS) which is Microsoft's ETL (Extract-Transform-Load) software. It's meant to do this kind of work. When I was doing this kind of work 5 million records would probably be 10-15 minutes if the SSIS package was designed correctly.

My next suggestion is can you increase your data locality by caching the database in memory on the server? Going across the network is going to kill your performance especially if you are reaching across the Internet over a VPN. Even a 5ms round trip call (SQL Server on a LAN) for 5 million records is 6.9 hours of waiting on your network. Not counting the call to insert everything (another 6.9 hours). I haven't done much EF but it doesn't seem to be built to load millions of records.

if this is creating a new req for each row i can see how it's taking days (!!!) unless constructing a dbobject from a row is super duper expensive

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

Bognar posted:

I 100% guarantee you that this is your problem. Find a way to look for duplicates that doesn't involve a repeated database call. I don't know what your duplicate checking logic is like, but there's a good chance it can be done outside of the DB. Or you can try to do it all inside the DB, just for the love of god don't split it across two machines.


This is probably a waste. There's a significant chance that you're latency bound, which means you're not burning CPU. Have you checked the load average on the VM?


Also, just for giggles, isn't this just if (count == 30000)?

loading data into SQL Server with dupe detection can and should be done in 1-2 statements (create a temp table or CTE, MERGE INTO it, and log matches)

god knows what's going on inside EF to generate the row tuple.

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

aBagorn posted:

So I think I'm going to make a recommendation to ditch EF if at all possible.

This service was written before I got here and dealt with files that contained at most a few thousand rows, and the fact that it was not designed to scale is showing.

The only problem I foresee is that the original dev did things like this with EF relationships to the dbObject before inserting.

C# code:

dbObject.dbObjectOwner = new dbObjectOwner(); 

dbObject.ListOfThingsRelated = ThingListCreatedBefore;

And inserts with foreign keys and multiple join tables for many to many relationships.

I started looking into BULK INSERT but it seems like it's going to be multiple steps, especially if we are getting away from inserting these fully hydrated EF objects

as much as i dislike ORMs, i don't think EF is your (only) issue. at worst you can have EF act as an object -> SQL statement creator

but that isn't all that slow even if you have each object insert in its own transaction (5mil transactions is trivial)

clearly something is deeply hosed in your validation and creation of the object. fix that first.

Malcolm XML
Aug 8, 2009

I always knew it would end like this.
nuget is rear end and should have been replaced by maven for .net


maven owns and i miss it every time i have to deal w/ the clusterfuck that is nuget + msbuild + asp.net json projects


Malcolm XML
Aug 8, 2009

I always knew it would end like this.

RICHUNCLEPENNYBAGS posted:

I haven't used maven but I like nuget well enough. What's wrong with it?

Ithaqua posted:

ASP .NET 5 isn't even out of beta yet, I think it's a bit premature to condemn it.

it gets the wrong thing right: it says how, not what

w/ maven i describe what i want and maven deals with the poo poo to get it done. enforce the convention and life is good :yayclod:

w/ nuget/msbuild/json i have to manually describe the build steps and dependencies and there's no real lifecycle concept to slot in tools (except if they manually gently caress w/ the msbuild file!!)

it's a giant PITA to handle nuget outside of VS; the CLI doesn't let you do half the things the PS console can do
