Commit Graph

15333 Commits

Author SHA1 Message Date
ReinUsesLisp 442a1cc021
gl_rasterizer: Re-enable stream buffer memory due to global memory
Global memory is still using the stream buffer when it shouldn't. As a
temporary fix re-enable the stream buffer on compute.
2019-11-02 13:19:19 -03:00
ReinUsesLisp 76ca2a5f82
gl_rasterizer: Upload constant buffers with glNamedBufferSubData
Nvidia's OpenGL driver maps gl(Named)BufferSubData to a fast path when
some requirements are met. This path has an extra memcpy but updates the
buffer without orphaning or waiting for previous calls. It can be seen
as a better model for "push constants" that can upload a whole UBO
instead of 256 bytes.

This path has some requirements established here:
http://on-demand.gputechconf.com/gtc/2014/presentations/S4379-opengl-44-scene-rendering-techniques.pdf#page=24

Instead of using the stream buffer, this commit moves constant buffer
uploads to calls of glNamedBufferSubData, and from my testing it brings a
performance improvement. This is disabled when the vendor is not Nvidia
since it brings performance regressions.
2019-11-02 05:05:34 -03:00
Rodrigo Locatti 11e39da02b
Merge pull request #3054 from FernandoS27/fix-tld4-2
shader_ir: Fix regression on TLD4
2019-10-31 01:56:29 +00:00
Fernando Sahmkow 23cabc98db Shader_IR: Fix regression on TLD4
Originally, in the last commit, I thought TLD4 acted the same as TLD4S and
didn't have a mask. It actually does have a component mask. This commit
corrects that.
2019-10-30 21:14:57 -04:00
Rodrigo Locatti 658489ebf7
Merge pull request #3050 from FernandoS27/fix-tld4
shader_ir: Fix TLD4 and add bindless variant
2019-10-30 18:37:17 +00:00
Fernando Sahmkow 9293c3a0f2 Shader_IR: Fix TLD4 and add Bindless Variant.
This commit fixes issues where not all 4 results of tld4 were being
written and the color component defaulted to red, among other things.
It also implements the bindless variant.
2019-10-30 12:02:03 -04:00
Rodrigo Locatti 04b838c857
Merge pull request #3038 from lioncash/docs
kernel/scheduler: Minor changes
2019-10-30 03:47:28 +00:00
bunnei 2382bbe3ac
Merge pull request #3046 from ReinUsesLisp/clean-gl-state
gl_state: Miscellaneous clean up
2019-10-29 22:50:04 -04:00
bunnei b5138f3c35
Merge pull request #3035 from ReinUsesLisp/rasterizer-accelerated
rasterizer_accelerated: Add intermediary for GPU rasterizers
2019-10-29 22:06:41 -04:00
bunnei a81bd962ab
Merge pull request #3007 from DarkLordZach/fsc-regress
savedata_factory: Automatically create certain savedata
2019-10-29 22:05:09 -04:00
Rodrigo Locatti 3d0cde6a75
gl_state: Use std::array::fill instead of std::fill
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-10-30 01:30:31 +00:00
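A minimal sketch of the difference this commit title refers to. The array type and function names below are hypothetical and are not taken from gl_state; the point is only that the member function does the same work as the generic algorithm without spelling out an iterator pair.

```cpp
#include <algorithm>
#include <array>

// Hypothetical state array; the name and size are illustrative only.
using EnabledFlags = std::array<bool, 8>;

// Generic algorithm form: works on any iterator range.
inline void ResetWithStdFill(EnabledFlags& flags) {
    std::fill(flags.begin(), flags.end(), false);
}

// Member form: same effect, expressed directly on the container.
inline void ResetWithMemberFill(EnabledFlags& flags) {
    flags.fill(false);
}
```

Both functions leave the array in the same state, so the change is about readability rather than behavior.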
ReinUsesLisp ce20ed8e4e
gl_state: Move dirty checks to individual apply calls instead of Apply
This requires removing constness from some methods, but for consistency
it's removed in all methods.
2019-10-29 21:27:25 -03:00
ReinUsesLisp 3c6557c235
gl_state: Remove ApplyDefaultState
OpenGL has default values we can trust. Remove these.
2019-10-29 21:27:25 -03:00
ReinUsesLisp d3651b0b82
gl_state: Change SetDefaultViewports to use default constructor 2019-10-29 21:27:24 -03:00
ReinUsesLisp c7698d0bc8
gl_state: Minor style changes 2019-10-29 21:27:24 -03:00
ReinUsesLisp a14d202ac2
gl_state: Remove unused Citra TextureUnits 2019-10-29 21:27:24 -03:00
ReinUsesLisp 28fece8e9b
gl_state: Move initializers from constructor to class declaration 2019-10-29 21:27:23 -03:00
ReinUsesLisp a993df1ee2
shader/node: Unpack bindless texture encoding
Bindless textures were using u64 to pack the buffer and offset from
where they come from. Drop this in favor of separated entries in the
struct.

Remove the usage of std::set in favor of std::list (it's not std::vector
to avoid reference invalidations) for samplers and images.
2019-10-29 20:53:48 -03:00
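The remark about reference invalidation can be illustrated with a small sketch. The struct and class below are hypothetical stand-ins, not the actual shader node types: they only show that a reference into a std::list stays valid while further entries are appended, which std::vector does not guarantee once it reallocates.

```cpp
#include <cstddef>
#include <list>

// Hypothetical stand-in for a sampler entry. Buffer and offset are kept as
// separate fields rather than packed into a single u64.
struct SamplerEntry {
    unsigned buffer;  // constant buffer index
    unsigned offset;  // offset within that buffer
};

class SamplerTable {
public:
    // Returns a reference that remains valid while the entry stays in the
    // list: std::list never relocates existing elements on insertion.
    const SamplerEntry& GetOrAdd(unsigned buffer, unsigned offset) {
        for (const SamplerEntry& entry : entries) {
            if (entry.buffer == buffer && entry.offset == offset) {
                return entry;
            }
        }
        entries.push_back(SamplerEntry{buffer, offset});
        return entries.back();
    }

    std::size_t Size() const {
        return entries.size();
    }

private:
    std::list<SamplerEntry> entries;
};
```

The trade-off in this sketch is a linear search on lookup in exchange for stable addresses.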
Lioncash 1643af431c externals: Track upstream libzip
Stops relying on a fork for providing zip handling and instead tracks
the upstream branch but keeps any necessary build-related changes in the
source tree directly without modifying the libzip target itself.
2019-10-29 19:52:40 -04:00
Rodrigo Locatti 2ec5b55ee3
Merge pull request #3004 from ReinUsesLisp/maxwell3d-cleanup
maxwell_3d: Remove unused entries
2019-10-29 23:46:33 +00:00
Lioncash c2486f77e4 externals: Amend zlib submodule
Supplies CMakeLists.txt file that avoids pulling in zlib's tests into
the tree. This avoids needing to explicitly opt these tests out from
ctest.
2019-10-29 16:58:23 -04:00
Rodrigo Locatti 9f93ad08a5
Merge pull request #3023 from lioncash/opus
externals: Track upstream opus
2019-10-28 02:45:01 -03:00
Rodrigo Locatti c5d9589942
Merge pull request #3037 from FernandoS27/new-formats
video_core: Implement texture format E5B9G9R9_SHAREDEXP.
2019-10-28 01:36:58 -03:00
Lioncash 6c8f28813c scheduler: Mark parameter of AskForReselectionOrMarkRedundant() as const
This is only compared against, so it can be made const.
2019-10-27 23:35:50 -04:00
ReinUsesLisp fa31e5b868
maxwell_3d/kepler_compute: Remove unused arguments in GetTexture 2019-10-28 00:23:42 -03:00
ReinUsesLisp 538ddd220e
video_core/textures: Remove unused index entry in FullTextureInfo 2019-10-28 00:14:38 -03:00
ReinUsesLisp 961fe4d19b
maxwell_3d: Remove unused method GetStageTextures 2019-10-28 00:14:29 -03:00
Lioncash f19c1a7cda scheduler: Silence sign conversion warnings 2019-10-27 22:44:52 -04:00
Lioncash 2fb0bbff29 scheduler: Initialize class members directly where applicable
Reduces the overall amount of code.
2019-10-27 22:13:55 -04:00
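As a rough illustration of why initializing members directly reduces code, compare the two hypothetical structs below. The member names are invented for the example and are not the scheduler's.

```cpp
#include <cstdint>

// Before: every member is initialized in the constructor's init list.
struct CounterBefore {
    CounterBefore() : core_id{0}, idle_ticks{0}, reselection_pending{false} {}

    std::uint32_t core_id;
    std::uint64_t idle_ticks;
    bool reselection_pending;
};

// After: default member initializers sit next to the declarations, so the
// hand-written constructor can be dropped entirely.
struct CounterAfter {
    std::uint32_t core_id{0};
    std::uint64_t idle_ticks{0};
    bool reselection_pending{false};
};
```

Both versions produce the same initial state; the second also keeps each default next to the member it belongs to.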
Lioncash 2dc469ceba scheduler: Amend documentation comments
Adjusts the formatting of a few of the comments and ensures they get
recognized as proper Doxygen comments.
2019-10-27 22:12:32 -04:00
David 4c5731c34f
Merge pull request #2971 from FernandoS27/new-scheduler-v2
Kernel: Implement a New Thread Scheduler V2
2019-10-28 10:53:27 +11:00
Fernando Sahmkow 3f9262195b Video_Core: Implement texture format E5B9G9R9_SHAREDEXP.
This commit adds the E5B9G9R9 texture format to the general system and
the OpenGL backend.
2019-10-27 16:44:09 -04:00
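For readers unfamiliar with shared-exponent formats, here is a sketch of how one such texel could be decoded on the CPU. It assumes the bit layout OpenGL uses for RGB9_E5 (three 9-bit mantissas in the low bits, a 5-bit exponent in the top bits, exponent bias 15). The function name is hypothetical and this is not the code added by the commit.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Decodes one packed texel into linear RGB floats.
// Assumed layout: R in bits 0-8, G in bits 9-17, B in bits 18-26,
// shared exponent in bits 27-31.
inline std::array<float, 3> DecodeSharedExp(std::uint32_t packed) {
    constexpr int mantissa_bits = 9;
    constexpr int exponent_bias = 15;

    const std::uint32_t r = packed & 0x1FF;
    const std::uint32_t g = (packed >> 9) & 0x1FF;
    const std::uint32_t b = (packed >> 18) & 0x1FF;
    const int exponent = static_cast<int>(packed >> 27);

    // Each channel is mantissa * 2^(exponent - bias - mantissa_bits).
    const float scale = std::ldexp(1.0f, exponent - exponent_bias - mantissa_bits);
    return {static_cast<float>(r) * scale, static_cast<float>(g) * scale,
            static_cast<float>(b) * scale};
}
```

Because all three channels share one exponent, the format favors texels whose channels have similar magnitude.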
bunnei 6909b2f0f9
Merge pull request #3034 from ReinUsesLisp/w4244-maxwell3d
maxwell_3d: Silence implicit conversion warnings
2019-10-27 15:08:59 -04:00
ReinUsesLisp 3e469cecc1
maxwell_3d: Silence implicit conversion warnings
While we are at it, unify types for dirty reg pointers.
2019-10-27 15:22:17 -03:00
bunnei 7e2494e987
Merge pull request #3033 from ReinUsesLisp/w4244-astc
astc: Silence implicit conversion warnings
2019-10-27 14:09:53 -04:00
ReinUsesLisp bd2aff3e26
rasterizer_accelerated: Add intermediary for GPU rasterizers
Add an intermediary class that implements common functions across GPU
accelerated rasterizers. This avoids code repetition on different
backends.
2019-10-27 03:40:08 -03:00
ReinUsesLisp a5aa1bb174
astc: Silence implicit conversion warnings 2019-10-27 03:04:50 -03:00
Rodrigo Locatti 26f3e18c5c
Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased
Implement Fast BRX, fix TXQ and adapt the Shader Cache for it
2019-10-26 16:56:13 -03:00
Fernando Sahmkow be856a38d6 Shader_IR: Address Feedback. 2019-10-26 15:38:30 -04:00
Rodrigo Locatti a0d79085c4
Merge pull request #3027 from lioncash/lookup
shader_ir: Use std::array with std::pair instead of std::unordered_map
2019-10-26 05:49:15 -03:00
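The general idea behind swapping a small std::unordered_map for a std::array of std::pair can be sketched as follows. The enum, table contents, and function name are hypothetical and are not the tables touched by this pull request.

```cpp
#include <algorithm>
#include <array>
#include <utility>

// Hypothetical key type, for illustration only.
enum class Opcode { Add, Mul, Fma, Nop };

// Fixed table built at compile time: no heap allocation and no hashing.
constexpr std::array<std::pair<Opcode, int>, 3> operand_counts{{
    {Opcode::Add, 2},
    {Opcode::Mul, 2},
    {Opcode::Fma, 3},
}};

// Linear search over the table; returns -1 when the key is absent.
inline int LookupOperandCount(Opcode op) {
    const auto it = std::find_if(operand_counts.begin(), operand_counts.end(),
                                 [op](const std::pair<Opcode, int>& entry) {
                                     return entry.first == op;
                                 });
    return it != operand_counts.end() ? it->second : -1;
}
```

For a handful of entries, a linear scan over contiguous memory is typically competitive with hashing, and the table needs no dynamic initialization.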
Rodrigo Locatti d52598173d
Merge pull request #3013 from FernandoS27/tld4s-fix
Shader_Ir: Fix TLD4S from using a component mask.
2019-10-25 20:06:26 -03:00
Fernando Sahmkow e3afd6595a Shader_IR: Clang format 2019-10-25 09:01:32 -04:00
ReinUsesLisp 78f3e8a757 gl_shader_cache: Implement locker variants invalidation 2019-10-25 09:01:32 -04:00
ReinUsesLisp ec85648af3 gl_shader_disk_cache: Store and load fast BRX 2019-10-25 09:01:31 -04:00
ReinUsesLisp fa2c297f3e const_buffer_locker: Minor style changes 2019-10-25 09:01:31 -04:00
ReinUsesLisp 7b81ba4d8a gl_shader_decompiler: Move entries to a separate function 2019-10-25 09:01:31 -04:00
Fernando Sahmkow 1244f2d368 Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. 2019-10-25 09:01:31 -04:00
Fernando Sahmkow a05120ec0b Shader_IR: Correct typo in Consistent method. 2019-10-25 09:01:30 -04:00
Fernando Sahmkow 33fcec3502 Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it 2019-10-25 09:01:30 -04:00
Fernando Sahmkow 8909f52166 Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. 2019-10-25 09:01:30 -04:00