Optimizing memory functions with REP [Hooray! Solved]

Ycep
Member
Posts: 401
Joined: Mon Dec 28, 2015 11:11 am

Optimizing memory functions with REP [Hooray! Solved]

Post by Ycep »

I have recently tried to optimize my memcpy() by using REP instead of my old loop. It failed. Something is not right there. [Image: screenshot of the code]

Please do not respond, iansjack, if you are trying to insult me (you are on my foe list anyway, so that I don't get angry at some 70-year-old dog walker with his own website).

Solved! Solved! Solved! Solved! Solved! Solved!
I wrote it that large and that many times so you won't miss it.
Last edited by Ycep on Wed Nov 16, 2016 2:45 pm, edited 3 times in total.
glauxosdever
Member
Posts: 501
Joined: Wed Jun 17, 2015 9:40 am
Libera.chat IRC: glauxosdever
Location: Athens, Greece

Re: Optimizing memory functions with REP

Post by glauxosdever »

Hi,


For memcpy(), you don't want to combine dword copying with byte copying. Just copy bytes, it's faster and simpler. The Intel Optimisation Manual describes memory functions (section 3.7.6).
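
A minimal sketch of the byte-only approach, assuming GCC or Clang extended inline assembly on x86 (the code in the original post was MASM and is not reproduced here). The name `copy_bytes` is hypothetical, and the plain C loop is only a fallback so the sketch also builds on other compilers and targets. Whether REP MOVSB actually beats a dword loop depends on the CPU, so treat this as a starting point to benchmark rather than a definitive implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name. Copies n bytes from src to dest; like memcpy(),
   the two regions must not overlap. Returns the original dest. */
void *copy_bytes(void *dest, const void *src, size_t n)
{
    assert(n == 0 || (dest != NULL && src != NULL));
    void *ret = dest;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    /* REP MOVSB copies (E/R)CX bytes from [(E/R)SI] to [(E/R)DI].
       Assumption: the direction flag is clear, as the usual x86 ABIs
       guarantee on function entry. All three registers are modified,
       hence the "+" constraints. */
    __asm__ volatile ("rep movsb"
                      : "+D"(dest), "+S"(src), "+c"(n)
                      :
                      : "memory");
#else
    /* Portable fallback: the same byte-by-byte copy in plain C. */
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
#endif
    return ret;
}
```

There is no dword loop and no remainder handling at all: the byte count goes straight into the count register.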

As for iansjack, you insulted him numerous times today. I wouldn't be surprised if he specifically responded to this topic just to make you stop insulting him.


Regards,
glauxosdever
Octocontrabass
Member
Posts: 5633
Joined: Mon Mar 25, 2013 7:01 pm

Re: Optimizing memory functions with REP

Post by Octocontrabass »

Don't ask for help with screenshots of code.

What kind of broken assembler doesn't accept hexadecimal constants?

Read the Intel manual's description of the DIV instruction.

As mentioned by glauxosdever, CPUs implementing ERMSB (Enhanced REP MOVSB/STOSB) will have higher performance using MOVSB instead of MOVSD. However, you should probably first worry about getting memcpy() to work, and worry about optimizing it later, when you're able to benchmark it.
Ycep
Member
Posts: 401
Joined: Mon Dec 28, 2015 11:11 am

Re: Optimizing memory functions with REP

Post by Ycep »

glauxosdever wrote:Hi,


For memcpy(), you don't want to combine dword copying with byte copying. Just copy bytes, it's faster and simpler. The Intel Optimisation Manual describes memory functions (section 3.7.6).

As for iansjack, you insulted him numerous times today. I wouldn't be surprised if he specifically responded to this topic just to make you stop insulting him.


Regards,
glauxosdever
I never said anything bad to him until now... He started criticising first (OK, he was right), but later he called my posts "obscenities" and said a lot of insulting things to me. Before that I liked the way he responds to people, but over time I saw his real character... (I am an ENTP-T person.)

Octocontrabass:
It is the MASM implementation in Visual Studio (though MASM32 does support them). A screenshot of code is much more readable than forum .
I already have memcpy(), I'm just optimizing it...
My DIV instruction usage is completely right.
Copying byte by byte has the same speed as copying dword by dword and then copying the remainder? Wow!
Let's try it...
OMG it's so freaking fast! Glauxosdever, you are a wonderful person! Solved!
glauxosdever
Member
Posts: 501
Joined: Wed Jun 17, 2015 9:40 am
Libera.chat IRC: glauxosdever
Location: Athens, Greece

Re: Optimizing memory functions with REP

Post by glauxosdever »

Hi,


Actually, the truth is somewhere in the middle. You didn't insult him numerous times, and you didn't insult him for the first time in this thread. You insulted him exactly once more:
Lukand wrote:We already know that iansjack just clicked "no" without reading the rest of the post...
Anyway, let's get back to the topic.


Regards,
glauxosdever
Octocontrabass
Member
Posts: 5633
Joined: Mon Mar 25, 2013 7:01 pm

Re: Optimizing memory functions with REP

Post by Octocontrabass »

Lukand wrote:Octocontrabass: It is the MASM implementation in Visual Studio (though MASM32 does support them). A screenshot of code is much more readable than forum .
I'm surprised, it seems like such an obvious feature. Maybe it supports the "h" suffix, like "0FFFFFFFCh"?

Screenshots may be more readable to you, but they aren't for everyone. Remember, at least one active user here can't read your screenshots. (Plus, green and orange are far too similar to my eyes.)
Octocontrabass
Member
Posts: 5633
Joined: Mon Mar 25, 2013 7:01 pm

Re: Optimizing memory functions with REP [Hooray! Solved]

Post by Octocontrabass »

Lukand wrote:My DIV instruction usage is completely right.
I don't see you setting EDX to zero. Have you forgotten that "div ebx" will divide the 64-bit value EDX:EAX by EBX, store the quotient in EAX, and store the remainder in EDX?
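
To make the point concrete, here is a small C model of what `div ebx` computes. It is only a sketch: `emulate_div32` and `div_result` are hypothetical names, and the asserts stand in for the #DE exception a real CPU raises when the divisor is zero or the quotient does not fit in EAX.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the register state after "div ebx". */
typedef struct {
    uint32_t eax; /* quotient  */
    uint32_t edx; /* remainder */
} div_result;

div_result emulate_div32(uint32_t edx, uint32_t eax, uint32_t ebx)
{
    /* A real CPU raises #DE on division by zero. */
    assert(ebx != 0);

    /* The dividend is the full 64-bit value EDX:EAX, not just EAX. */
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    uint64_t quotient = dividend / ebx;

    /* A real CPU also raises #DE if the quotient needs more than 32 bits. */
    assert(quotient <= UINT32_MAX);

    div_result r = { (uint32_t)quotient, (uint32_t)(dividend % ebx) };
    return r;
}
```

With EDX cleared, `emulate_div32(0, 10, 4)` gives quotient 2 and remainder 2. With a stale EDX of 1, the same EAX and EBX give a quotient of 0x40000002, which is why leftover data in EDX silently corrupts the result (or faults, if the quotient overflows).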
Ycep
Member
Posts: 401
Joined: Mon Dec 28, 2015 11:11 am

Re: Optimizing memory functions with REP [Hooray! Solved]

Post by Ycep »

Octocontrabass wrote:
Lukand wrote:My DIV instruction usage is completely right.
I don't see you setting EDX to zero. Have you forgotten that "div ebx" will divide the 64-bit value EDX:EAX by EBX, store the quotient in EAX, and store the remainder in EDX?
Oh whoops. Hadn't seen that one ;).
Anyway, I think the Segment:Offset design is really pointless.
Instead of:

Segment*16+Offset=Address

wouldn't it be better if they did:

Segment*65536+Offset=Address

I mean, then there would not be any overlapping segments, it could access the full 4 GiB of RAM, and it might remove some people's confusion and curiosity about why they chose *16. If those IBM guys had drunk Hennessy (they are well paid) instead of Serbian rakija (drinking 1 l of it leads to 6‰ blood alcohol, which could kill almost any human being), then it would be cool!
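
The two formulas can be compared with a short C sketch. All function names here are hypothetical, and `proposed_address` models only the scheme suggested in this post, not anything an x86 CPU implements.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode translation: Segment*16 + Offset. The raw sum can reach
   0x10FFEF; an 8086 has only 20 address lines, so it wraps at 1 MiB. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* The scheme proposed above: Segment*65536 + Offset. Segments never
   overlap, and every seg:off pair maps to a distinct 32-bit address. */
uint32_t proposed_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 16) | offset;
}

/* Split a 20-bit address into the seg:off pair with the smallest offset.
   Many other pairs alias the same address under the *16 rule. */
void normalize(uint32_t linear, uint16_t *segment, uint16_t *offset)
{
    assert(linear <= 0xFFFFFu);
    *segment = (uint16_t)(linear >> 4);
    *offset = (uint16_t)(linear & 0xFu);
}
```

Under the *16 rule, 0x1234:0x0010 and 0x1235:0x0000 both name address 0x12350, which is the overlap being complained about. Under the proposed rule the first pair would map to 0x12340010 instead.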
iansjack
Member
Posts: 4724
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Optimizing memory functions with REP

Post by iansjack »

Octocontrabass wrote:
Lukand wrote:Octocontrabass: It is the MASM implementation in Visual Studio (though MASM32 does support them). A screenshot of code is much more readable than forum .
I'm surprised, it seems like such an obvious feature. Maybe it supports the "h" suffix, like "0FFFFFFFCh"?

Screenshots may be more readable to you, but they aren't for everyone. Remember, at least one active user here can't read your screenshots. (Plus, green and orange are far too similar to my eyes.)
MASM does support the h suffix. A gotcha that is easily forgotten is that the constant needs to start with a 0 (or some other digit); otherwise the assembler reads it as an identifier. It would, indeed, be a very broken assembler that didn't allow hexadecimal constants.
Roman
Member
Posts: 568
Joined: Thu Mar 27, 2014 3:57 am
Location: Moscow, Russia

Re: Optimizing memory functions with REP [Hooray! Solved]

Post by Roman »

AFAIK, iansjack called one of your posts an obscenity because you used the f* word in the Java OS thread (later you or a moderator edited the message).
"If you don't fail at least 90 percent of the time, you're not aiming high enough."
- Alan Kay
iansjack
Member
Posts: 4724
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Optimizing memory functions with REP [Hooray! Solved]

Post by iansjack »

Correct.

Considering the age of posters on this site such language is inappropriate.
evoex
Member
Posts: 103
Joined: Tue Dec 13, 2011 4:11 pm

Re: Optimizing memory functions with REP [Hooray! Solved]

Post by evoex »

I know it's solved, but just one more remark/question about your code: why don't you use a right-shift rather than an explicit div? The latter is not only slower, but in my opinion significantly less readable.
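
A sketch of what I mean, in C rather than assembly, assuming the DIV was there to split the byte count into whole dwords plus leftover bytes (the screenshot is not reproduced here, so that is a guess). The names `copy_split` and `split_count` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: how many dwords and how many tail bytes. */
typedef struct {
    size_t dwords;     /* n / 4, computed with a shift */
    size_t tail_bytes; /* n % 4, computed with a mask  */
} copy_split;

copy_split split_count(size_t n)
{
    /* Dividing by a power of two needs no DIV: shift for the quotient,
       mask for the remainder. In assembly this would be SHR by 2 and
       AND with 3, with no EDX setup required. */
    copy_split s = { n >> 2, n & 3 };

    /* Sanity check: the two parts add back up to the original count. */
    assert(s.dwords * 4 + s.tail_bytes == n);
    return s;
}
```

For a 10-byte copy this yields 2 dwords and 2 tail bytes, the same answer DIV by 4 would give.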
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Optimizing memory functions with REP [Hooray! Solved]

Post by Schol-R-LEA »

Lukand wrote: I mean, then there would not be any overlapping segments, it could access the full 4 GiB of RAM, and it might remove some people's confusion and curiosity about why they chose *16. If those IBM guys had drunk Hennessy (they are well paid) instead of Serbian rakija (drinking 1 l of it leads to 6‰ blood alcohol, which could kill almost any human being), then it would be cool!
Here comes the "actually..." for today: First, it wasn't IBM who chose segmented memory for the 8086, it was Intel. IBM had made that mistake once before (with the original 360) and probably wouldn't have done it again if it were up to them. The reason that Intel did such a silly thing was that they wanted to make it easy for 8080 assembly programmers to rewrite their code for the new chip, which they saw as just a temporary design meant solely for microcontrollers anyway (because home computers weren't here to stay, and the emerging workstation market would surely want something more powerful such as their spiffy new i432 design - you know, the one that wouldn't hit the market until 1981 and didn't work right even when it did).

Second, it wasn't IBM that chose the 8086, it was Bill Gates and Gary Kildall (and some awkward timing) that pushed them to it. The Boca Raton design group were kind of on the outs with Armonk to begin with (because the management at Big Blue were more interested in quashing home computers for good than in making their own) but were given free rein within their budget constraints, so they did something almost unheard of in IBM circles: they solicited outside advice from other IT companies, specifically Intel, Motorola, Microsoft, and Digital Research.

Initially, their plan was to build another 8080 or Z80 based CP/M system of the type already common in the market, and Intel, who were mainly interested in microcontrollers anyway and didn't want to tie a lot of resources up in producing 8086/8088 chips (which they saw as diverting resources from their 'real' Next Big Thing, the i432), recommended that as the course to take. Motorola, naturally enough, recommended using their own chips, but since the 6800 line was more expensive due to lower sales volumes, and the 68000 was still unproven even after a year on the market, IBM decided on Intel (I don't know if they really considered Zilog a serious option or not; the Z80 was basically a modified 8080, while the Z8000 wasn't much better supported than the 68K was at the time even though it came out before the Motorola chip).

As for the other silicon fabs of the day, TI and Fairchild didn't have any microprocessor yet (AFAIK), National Semi and RCA were on IBM's S*** List for trying to break into the minicomputer markets years earlier, and MOS Technology (Chuck Peddle's company, makers of the 6502, which was used in a huge raft of systems including the Apple ][ and the Atari 2600) had just been bought by Commodore, and both Peddle and Jack Tramiel would sooner have swallowed rat poison than play ball with the Blue Meanies.

They then solicited Digital Research (Gary Kildall's company, who produced CP/M, not to be confused with the Digital Equipment Corporation who built the PDPs and VAXen) and the UCSD p-System team for operating systems, figuring that it would be better to have at least two alternatives for the system, and Microsoft for a BASIC interpreter and an assembler. However, both Kildall and Gates mentioned that the market for 8-bit systems seemed to already be saturated, and brought up the 8086, which (AFAIK) Intel hadn't really discussed for the reasons mentioned earlier. Seeing a chance to leapfrog the competition, the IBM team took the advice despite Intel's misgivings.

It was also around this time that Gates learned that IBM was still looking for bids for alternate operating systems, and as it happened, he knew that another company which Microsoft was negotiating a buyout of, Seattle Computer Products, already had an OS for the 8086, which they had initially called QDOS but had just renamed 86-DOS. He put in a bid to IBM to have this as their third OS option, and slyly suggested that they could have it ready before Digital Research could modify CP/M-86 for the new hardware. This led to IBM signing a deal that had a bit of a sting in its tail: they licensed the re-re-christened MS-DOS under yet another name, PC-DOS, on a non-exclusive basis similar to the one they licensed CP/M-86 and UCSD p-System under. While this was a common practice at the time, it sowed the seeds of future PC-compatibles like Compaq, and semi-compatibles such as the DEC Rainbow and the Olivetti M24 (sold in the US as the AT&T 6300), which could run the same OS (if not necessarily the same executable files) as the IBM hardware, and eventually set the stage for Microsoft's rise as more than just a language vendor.

It isn't entirely clear why PC-DOS came to dominate over the better established CP/M-86 and UCSD systems, but the fact that Microsoft took marketing seriously was surely part of it. In the first year, there were several orders of all three, but by the end of 1982, both CP/M and UCSD were basically out of the picture. The old chestnut (which I had repeated myself once only to be roughly corrected) about IBM not selling CP/M because Kildall had blown off a meeting to go sailplaning isn't true; IIRC, there was a delay in getting the contract signed because Kildall was recovering from a sailplane crash which happened several days before the deal was to be finalized, but it didn't actually impact the deal significantly.

But we were talking segments... anyway the TL;DR is, Intel used segments to make it easy to write 8-bit code on their 16-bit CPU, which IBM then used because it seemed like a good idea at the time.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Kazinsal
Member
Posts: 559
Joined: Wed Jul 13, 2011 7:38 pm
Libera.chat IRC: Kazinsal
Location: Vancouver

Re: Optimizing memory functions with REP [Hooray! Solved]

Post by Kazinsal »

Schol-R-LEA wrote:It isn't entirely clear why PC-DOS came to dominate over the better established CP/M-86 and UCSD systems, but the fact that Microsoft took marketing seriously was surely part of it. In the first year, there were several orders of all three, but by the end of 1982, both CP/M and UCSD were basically out of the picture. The old chestnut (which I had repeated myself once only to be roughly corrected) about IBM not selling CP/M because Kildall had blown off a meeting to go sailplaning isn't true; IIRC, there was a delay in getting the contract signed because Kildall was recovering from a sailplane crash which happened several days before the deal was to be finalized, but it didn't actually impact the deal significantly.
CP/M-86 wasn't actually ready, and Digital Research wanted a much more lucrative royalty bullet than IBM was willing to bite. They did eventually offer both CP/M-86 and UCSD p-System, but while PC-DOS was $40, CP/M-86 was $240 (probably out of spite since Kildall threatened to sue IBM for going with a house-brand CP/M clone instead of the real deal) and p-System was much more expensive (I don't have a figure on hand but I know it was over $500) and ended up not being particularly future-proof since UCSD Pascal faded away.

It didn't even take a year of PC-DOS being on the market for the largest CP/M program vendor, Lifeboat Associates, to announce a switch to PC-DOS. In time, Digital Research was porting their CP/M-86 tools to DOS and at one point was giving away free copies of Concurrent CP/M-86 with the purchase of applications for the platform.
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Optimizing memory functions with REP [Hooray! Solved]

Post by Schol-R-LEA »

Ah! Thank you for the corrections and clarifications, I didn't know those aspects of it.