To apply for this job, please visit:
Build and supervise the company's SW team.
An important charter of the team is performance analysis: determining system and hardware bottlenecks and providing feedback to the hardware architects and designers to enhance performance and reduce power consumption for browser workloads. The role includes, but is not limited to:
1- Analyze and characterize benchmarks and applications on current architectures, and work with software developers and chip architects to implement solutions and optimizations.
2- Debug performance issues reported by both internal and external customers. The candidate will work very closely with the System, Processor Architecture, and Circuit leads, and will also engage with the SW team, to understand and optimize the overall system architecture for performance and power.
3- Design and develop target-specific optimizations and code generation for CPUs and decimal coprocessors, and/or design and develop modified compilers.
1- Demonstrated skills in effecting change. The candidate needs to be able to pull together their own ideas and suggestions from others into a vision for achieving individual and group objectives, and must then be able to implement this vision with minimal supervision.
2- A thorough understanding of computer architecture, including processors, memory, storage, graphics subsystems, and network.
3- Detailed understanding of the functioning of CPUs - pipeline structure & hazards, cache & memory organization, etc.
4- Expertise with the x86 instruction set and architecture, and with assembly language.
5- Understanding of the compilation challenges and potential solutions for languages like C/C++/Java/COBOL.
6- Working knowledge of operating system concepts: processes and threads, memory management, garbage collection algorithms, and thread scheduling.
7- Strong technical background in software development techniques, benchmarking, and performance analysis. Test automation skills are a plus.
8- Strong understanding of databases and financial SW is a plus.
9- Experience with the compiler backend in re-targetable compiler frameworks like GCC, LLVM, and Open64 is a plus.
10- Knowledge of the compiler middle end is preferred: development of architecture-independent optimization technology, e.g. control flow analysis, data flow analysis, SSA, global optimizations, inter-procedural optimizations, loop optimizations, feedback-directed optimizations, and performance analysis. Target architectures: homogeneous multi-core (x86) and heterogeneous multi-core (x86 + decimal coprocessor) architectures.
11- Experience with profiling tools preferred.
12- Experience in SW transformers is a plus.
Experience must include:
1- C/C++ programming
2- x86 architecture knowledge and assembly code optimization experience.
3- Working knowledge of Linux and Windows.
4- Usage of debuggers, profiling tools, and tools used to measure processor performance.
5- Candidates with a strong understanding of compilers and compiler optimizations, codegen analysis, and virtual machines are preferred.
6- Good interpersonal and communication (written and oral) skills.
BS and 10+ years of experience, MS and 8+ years of experience, or PhD and 4+ years of experience.
Skills and Keywords
Must have: software, developer, architect, hardware, processor, x86
Important: GCC, optimization, linux, c++
Nice to have: LLVM
SilMinds offers innovative solutions for Decimal Floating Point (DFP) hardware acceleration. We are the first to introduce a library of IP core units that is compliant with the IEEE 754-2008 standard. In addition, we offer integrated hardware/software DFP acceleration products and custom solutions to bring our customers the highest levels of performance and energy efficiency in computing.
Our patented DFP arithmetic technology serves various financial applications, characterized by intense DFP computations, in a wide range of financial sectors including core banking, telecoms call rating and billing, currency and stock exchange, real time payment processing, and more...
SilMinds coprocessor-based accelerator cards (SilAx and SilAxPro) can be plugged directly into the PCIe bus or the CPU socket of a server or personal computer. The accelerators' core functions are implemented using reconfigurable FPGAs, offering full customization of the acceleration solution in terms of both computation profile and throughput. The speedup factors achieved through SilAx DFP accelerators promise unprecedented energy and total cost of ownership savings for financial data centers.
Decimal Floating Point Arithmetic (DFPA), Accelerated high performance computing (HPC), Financial data center optimization