The Icon Bar: Programming: Code GCC produces that makes you cry #12684
 
 
Simon Willcocks Message #124444, posted by Stoppers at 19:00, 13/2/2019, in reply to message #124443
Member
Posts: 292
I'd mixed up what the parameters meant, so the last value was too large, which meant that the function would have returned null...
 
Jeffrey Lee Message #124445, posted by Phlamethrower at 19:08, 13/2/2019, in reply to message #124443

Posts: 15079
(Those first two instructions are interesting, as well!)
As a guess, I'd say you're looking at an unlinked object file, in which case the instruction at 284 could be a placeholder which will later get patched with a proper value by the linker.

The rest would probably only make sense if I knew what the rest of the code looked like :)
 
Simon Willcocks Message #124446, posted by Stoppers at 23:22, 13/2/2019, in reply to message #124445
Member
Posts: 292
I wondered about that, too, but:

built_drivers/memory_management.elf: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, with debug_info, not stripped

Moving slowly on...
 
Jeffrey Lee Message #124502, posted by Phlamethrower at 14:14, 3/6/2019, in reply to message #124446

Posts: 15079
GCC's NEON auto-vectorisation is abysmal.

This article is a good starting point for all the things you need to tweak to get vectorisation working nicely on x86. But ARM NEON instructions will work happily with word-aligned data, so there won't be any need to specify alignment for your data, right? Wrong.

This produces nice tight code:

#include <stdlib.h>
#include <math.h>

#define SIZE (1L << 16)

void test1(int * __restrict a, int * __restrict b)
{
    int i;

    int *x = (int *) __builtin_assume_aligned(a, 4);
    int *y = (int *) __builtin_assume_aligned(b, 8);

    for (i = 0; i < SIZE; i++)
    {
        x[i] += y[i];
    }
}


This also produces nice code:

#include <stdlib.h>
#include <math.h>

#define SIZE (1L << 16)

void test1(int * __restrict a, int * __restrict b)
{
    int i;

    int *x = (int *) __builtin_assume_aligned(a, 8);
    int *y = (int *) __builtin_assume_aligned(b, 4);

    for (i = 0; i < SIZE; i++)
    {
        x[i] += y[i];
    }
}


This produces an 80-instruction monstrosity:

#include <stdlib.h>
#include <math.h>

#define SIZE (1L << 16)

void test1(int * __restrict a, int * __restrict b)
{
    int i;

    int *x = (int *) __builtin_assume_aligned(a, 4);
    int *y = (int *) __builtin_assume_aligned(b, 4);

    for (i = 0; i < SIZE; i++)
    {
        x[i] += y[i];
    }
}


Yep - one of the pointers needs to be at least doubleword-aligned, otherwise it adds a bunch of extra code to try and deal with imagined alignment issues.
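
For anyone wanting to try this, here is a minimal sketch of how a caller might actually keep the doubleword-alignment promise. The names add_arrays and add_checked are made up for illustration, and the aligned attribute is a GCC/clang extension. The assert is only there to catch a broken promise in debug builds; it is not something the vectoriser needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIZE (1L << 16)

/* Same loop as the examples above, with both pointers promised 8-byte aligned. */
void add_arrays(int * __restrict a, int * __restrict b)
{
    long i;

    int *x = (int *) __builtin_assume_aligned(a, 8);
    int *y = (int *) __builtin_assume_aligned(b, 8);

    for (i = 0; i < SIZE; i++)
    {
        x[i] += y[i];
    }
}

/* Hypothetical wrapper: verify the alignment promise before relying on it. */
void add_checked(int *a, int *b)
{
    assert(((uintptr_t) a & 7) == 0);
    assert(((uintptr_t) b & 7) == 0);
    add_arrays(a, b);
}

/* Buffers forced onto an 8-byte boundary with the GCC aligned attribute. */
int buffer_a[SIZE] __attribute__((aligned(8)));
int buffer_b[SIZE] __attribute__((aligned(8)));
```

Calling add_checked(buffer_a, buffer_b) is then safe. Passing something like buffer_a + 1 would trip the assert instead of silently breaking the promise made to the compiler.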

[Edited by Phlamethrower at 14:15, 3/6/2019]
 
Jeffrey Lee Message #124503, posted by Phlamethrower at 22:48, 3/6/2019, in reply to message #124502

Posts: 15079
RISC OS GCC 4.7.4 seems to be a bit more finicky, requiring at least one of the variables to be 16-byte aligned to avoid nastiness.

I can understand some of its logic - the code will be faster if the pointers are aligned, especially for large buffers, so it makes sense to add extra code to bring one of the pointers into alignment. But why is the extra code so bloody long? You can write the same thing in about half as many instructions.
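
As a rough illustration of the idea (not what GCC emits, and no claim about instruction counts), the "bring one pointer into alignment" step can be written by hand in C along these lines. The function name add_peeled is hypothetical, and the sketch assumes x is a normally aligned int pointer, so that stepping one element at a time eventually hits a 16-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: x[i] += y[i] for n elements, peeling leading
   iterations by hand until x sits on a 16-byte boundary. */
void add_peeled(int * __restrict x, const int * __restrict y, size_t n)
{
    size_t i = 0;
    size_t j, rest;
    int *xa;
    const int *yb;

    /* Prologue: scalar iterations until x + i is 16-byte aligned. */
    while (i < n && ((uintptr_t) (x + i) & 15) != 0)
    {
        x[i] += y[i];
        i++;
    }

    /* Nothing left, so avoid making an alignment promise we can't keep. */
    if (i == n)
    {
        return;
    }

    /* Main loop: the destination is now known to be 16-byte aligned;
       the source keeps whatever alignment it happens to have. */
    xa = (int *) __builtin_assume_aligned(x + i, 16);
    yb = y + i;
    rest = n - i;

    for (j = 0; j < rest; j++)
    {
        xa[j] += yb[j];
    }
}
```

Only the destination is aligned here; aligning both pointers at once is not possible in general, since they may be offset from each other by a non-multiple of 16 bytes.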

[Edited by Phlamethrower at 22:48, 3/6/2019]
 
Jeffrey Lee Message #124574, posted by Phlamethrower at 12:17, 22/9/2019, in reply to message #124503

Posts: 15079
New clang optimisations!

http://releases.llvm.org/9.0.0/docs/ReleaseNotes.html#id4

I should really give it a go at some point.
 